CHAPTER ONE



For centuries, teachers have been teaching and
students have been doing whatever it is that students do.
It is only in this century, however, that any systematic
and sustained attempt has been made to study the nature and
the results of the interaction between teacher and student,
and, during this period, progress has been painfully slow.

The educational psychologist is faced with serious
difficulties in doing research on human learning and
performance. If research is to be done in a school, the
cooperation of administrators and teachers must be
obtained, and experiments must be tailored to fit the
organizational structure of the school. Even then, it is
very difficult to obtain detailed information on student
performance over a long period of instruction. It may be
possible to obtain an adequate description of social
processes from a discreet distance, but it seems almost
impossible to obtain detailed profiles of individual
student responses in this way. In order to obtain the data
necessary to investigate cognitive performance, it is
necessary to record student behavior in great detail.

Since it is impractical to maintain teams of research
workers in a classroom without completely disrupting the
process to be observed, the systematic investigation of
problem solving behavior has been restricted to the
laboratory. Laboratory research on these issues has been
hampered by the difficulty in obtaining adequate samples of
subjects willing to work on problem-solving tasks over a
long period of time.

The advent of computer-assisted instruction makes it
possible to circumvent some of these difficulties. When a 
student does problems at a computer terminal, it is 
possible to record a complete profile of his typed 
responses (as well as the time to each response). Since 
the collection of these responses is automated, and 
therefore invisible to the student, it is possible to 
record problem solving behavior over a long period of time 
without disrupting the process being observed. In a 
semester of work in mathematics done at a computer 
terminal, it is relatively easy to obtain complete profiles 
of individual student solutions to hundreds of problems.



This implies a further advantage of using CAI for 
research on problem solving. In a laboratory experiment or 
in classroom observation, the subjects (or students) are 
aware that their efforts are being recorded. It has been 
shown that, under such conditions, subjects tend to modify 
their behavior to fit the expectations of the experimenter 
(Neisser, 1967). To the extent that data collection is
truly invisible, this more subtle source of possible bias 
in the data is also eliminated. 

The use of a CAI curriculum as a context for research 
on cognitive processes still presents serious difficulties,
however. In order to exploit its full potential, we must
develop techniques for analyzing and interpreting the data 
collected. The principal purpose of this research was to 
develop such techniques for examining the details of 
student proof behavior. 

The traditional tools used to analyze the results of
educational and psychological experiments are, of course, 
available and have been used. Regression analysis, for 
example, has been used extensively in investigating the 
effects of curriculum structure on student performance. 
The analysis of variance has been used to compare CAI to 
more traditional types of instruction, and to examine the 
effect produced by varying certain conditions within CAI.

It is clear that the use of such techniques can make a 
valuable contribution to our understanding of student 
behavior, but all of these studies deal with global 
measures of performance. They tell us how well students 
perform under various conditions; they do not tell us how
students perform - what they actually do. If the solution 
to a problem requires a sequence of steps rather than a 
single response, then this distinction is of great 
importance. The total time taken to solve a problem or the 
number of errors may be adequate measures of a student's 
overall performance, but they tell us nothing about how 
individual students solve problems. An analysis that makes 
use only of summary measures of performance ignores the 
structure of student solutions, and, so, does not exploit 
the full potential of CAI as a setting for educational 
research.



In this study a particular type of problem solving 
behavior is investigated. In the following sections, some 
techniques for analyzing the details of student proof
behavior in a complex CAI setting are developed and then
used to evaluate a specific aspect of the Stanford
Logic-Instructional System (LIS).

LIS is designed to allow students considerable
latitude in the construction of proofs, and students work
at their own pace and develop their own strategies for
finding proofs. By measuring the actual variation in a
sample of proofs collected under ordinary operating
conditions, it is possible to characterize the
effectiveness of the curriculum in encouraging diversity in
the students' approaches to proof construction. This
research was motivated by a desire to estimate how much
variation (in the types of proofs generated) actually
occurs when students work through the current LIS
curriculum.



The data collection facilities for LIS store a
complete record of each student's typed responses, and it
is possible to examine the exact sequence of steps for
every proof. It is possible, therefore, to determine the
number of classes of equivalent proofs in a sample of
student proofs, but first it is necessary to specify a set
of criteria that separates proofs into classes, and so
defines what is meant by the statement that two proofs are
equivalent.

The objective of the initial phase of this study is to
formulate such criteria. Five distinct procedures are
developed, each of which classifies any sample of proofs
into a set of mutually exclusive and exhaustive subsets,
thus defining a partition on the sample. The procedures
are essentially definitions of what it means to say that
two proofs are equivalent or not equivalent. These
partitions are then shown to be nested in the sense that if
two proofs are equivalent under the i-th partition, they
are also equivalent under the (i+1)-th partition. A
detailed development of these procedures is presented in
Chapter III.

The second purpose of this study was to determine the
amount of variation that actually occurs in the structure
of the proofs produced by a sample of college students for
the problems in the LIS curriculum. The proofs constructed
by 23 Stanford University students for 1 2:^ separate
derivation problems in the LIS curriculum are used for this
purpose. In order to determine how this variation is
distributed through the curriculum, each problem is
analyzed separately.

For all of the problems included in this study and
each set of criteria, the student proofs are assigned to
equivalence classes. The numbers of classes for the five
partitions for a problem are taken as separate measures of
the variability of the student proofs for that problem.
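
The use of class counts as measures of variability can be sketched as follows. This is only a minimal illustration, not the classification procedure of Chapter III: proofs are represented as tuples of rule codes, and the two key functions, `exact_sequence` and `rule_set`, are hypothetical stand-ins for a stringent and a lenient criterion.

```python
from collections import defaultdict


def partition(proofs, key):
    """Group proofs into classes: two proofs are equivalent iff key() agrees."""
    classes = defaultdict(list)
    for proof in proofs:
        classes[key(proof)].append(proof)
    return list(classes.values())


def exact_sequence(proof):
    # Hypothetical stringent criterion: identical sequences of rule codes.
    return tuple(proof)


def rule_set(proof):
    # Hypothetical lenient criterion: the same rules, in any order or number.
    return frozenset(proof)


def refines(fine, coarse):
    """True if every class of `fine` lies inside a single class of `coarse`."""
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)


proofs = [("AA",), ("DN", "AA"), ("AA", "DN"), ("AA",)]
strict = partition(proofs, exact_sequence)
lenient = partition(proofs, rule_set)

# One variability measure per criterion: the number of classes.
variability = (len(strict), len(lenient))
```

Because the stringent partition refines the lenient one, its class count can never be smaller; this is the sense in which nested partitions give an ordered family of variability measures.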

The results indicate that there is relatively little 
variability for the earliest problems and considerable 
variability for the later problems. The increase in 
variability through the curriculum is not smooth. There is
a gradual increase from the first problem considered to the
50-th problem (approximately), but even the last of these 
early problems shows relatively little variation among the 
proofs generated. There is then an abrupt increase in 
variability and subsequently a continued gradual increase. 
The rule, Replace Equals (RE), is introduced in the
curriculum just before the abrupt increase in variability; 
this initial indication of the importance of RE is 
confirmed by the subsequent regression analysis. 

Regression analysis is used to pinpoint variables 
defining structural properties of the problems which 
predict variability among the student proofs. The results 
indicate that relatively simple measures of structural 
complexity (for example, the number of steps in the
standard proof for a problem) are good predictors of the
amount of superficial variation in the sample of proofs,
such as differences in the order of the steps, but
relatively poor predictors for the more substantial
variations such as differences in the rules used to 
construct the proof. As the importance of these measures 
of structural complexity systematically decreases from the 
first to the fifth partition, the importance of the number 
of theorems (and axioms), as predictors of variability, 
increases. This analysis is described in Chapter IV, and 
the results of the analyses are presented in Chapters V and 
VI. 

The use of a nested sequence of measures, rather than 
a single measure, makes the detection of this trend 
possible. The results indicate that the regression 
equation which best predicts variability is quite sensitive
to changes in the measure of variability. If a single
measure of variability (partition) were used, there would
have been no indication of the sensitivity of the results
to the definition of equivalence, and it is likely that
erroneous conclusions would be drawn. For example, if only
the first partition had been used, it would seem that
theorems are relatively poor predictors of variability; in
fact, the other four partitions indicate that theorems are
very important predictors of variability.

In general, the most significant kinds of variability
(for example, differences in the rules used to construct a
proof) depend on the number and type of rules that are
available when the proof is done; Replace Equals and the
theorems are especially important. Where variability in
student proof behavior is desired, the more powerful rules
should be introduced as soon as possible.

In a third part of this study, an attempt was made to
identify patterns of proof behavior that characterized
groups of students over the sample of problems. This
attempt took advantage of the fact that metric functions
for the set of students can be easily defined in terms of
the classification procedures.

The search for patterns in student proof behavior was
exploratory in nature. If definable patterns had been
detected, their properties would have been investigated,
and further research in this direction would have been
suggested. In fact, no indication of the existence of
definable patterns was detected.

The failure of this part of the study to yield the
desired results was not surprising. The problems in the
logic curriculum are quite heterogeneous, and differences
in proofs from problem to problem are much more pronounced
than the differences between students for a given problem.
Since these efforts failed to reveal any substantial
results, and the questions raised here are peripheral to
the main purpose of the study, this part of the study is
not discussed in the main body of the text. The methods
developed for this part of the study, however, make
possible a more systematic analysis of problem solving
behavior and should be useful in future studies dealing
with problem solving behavior, so a description of the
analysis is included as Appendix A.

Overall, this study indicates that the use of formally
defined partitions over sets of complex behaviors (in this
case, proofs) can provide an intuitively satisfying and
fruitful technique for examining the details of complex
behavior.
CHAPTER TWO



I have included this brief description of the
operation of the Logic Instructional System (LIS) for those
with no previous experience of it; some discussion of the
curriculum is also included. The description is far from
complete, but I hope that it is sufficiently detailed to
enable the reader to follow the development in subsequent
sections. Further discussion of the material included in
this chapter can be found in James Moloney's dissertation
(Moloney, 1972) and in several papers by Patrick Suppes
(Suppes, 1965, 1970, 1971). A new instructional system for
elementary logic, which has many features in common with
the system discussed here, is described in detail in a
recent paper by Adele Goldberg (Goldberg, 1971).

The first part of the curriculum is designed to give a 
thorough introduction to sentential logic. Once the 
student has acquired an understanding of sentential logic,
he uses this knowledge in his study of elementary algebra. 
In sentential logic, the approach used is a natural 
deduction treatment in which the students are taught rules 
of inference, such as modus ponens, and proof procedures 
(conditional proof and indirect proof). Some examples of 
the rules of inference are: 



(A) Affirm the antecedent - AA

    From  (1) Q -> R
    and   (2) Q
    infer (3) R

(B) Form a conjunction - FC

    From  (1) Q
    and   (2) R
    infer (3) Q and R

Using the rules of inference, the student is asked to
construct a mathematically valid proof of some specified
sentence (formula) from a given set of premises. The proof
consists of a sequence of steps, each of which utilizes one
of the rules of inference. The computer does not interfere
with the course of the student's attempt to find a proof as
long as his steps are valid applications of the rules of
inference; the computer does act as a proof-checker to
determine if each new step is valid, and types an error
message whenever a rule is used incorrectly. This gives
the student the freedom to construct his own proof, subject
to the constraint that each step be a correct application
of some rule.

In the second part of the curriculum, the student is 
first taught certain rules about the identity relation 
(e.g., adding a term to both sides of the equation). Then
he is given a set of axioms for an additive group (i.e.,
commutativity, associativity, and the properties of zero
and negative numbers). From these axioms and the set of
rules, he constructs proofs for a number of theorems about 
addition. In his proofs, he can use any theorems that he 
has already proved as well as the axioms and rules that he 
has learned. 

The remainder of this paper deals exclusively with 
derivation problems, and I shall restrict the following 
discussion of LIS to its derivation mode, ignoring its 
other modes.

Each derivation problem consists of a formula to be 
derived and a sequence of k (with k possibly equal to 0) 
formulas called premises. The k premises are numbered 
sequentially from 1 to k. The student is required to find 
a sequence of valid steps that lead to the formula to be 
derived; when this formula is generated, LIS types CORRECT 
and continues with the curriculum. 

Essentially what a student does at each step of a 
proof is to give a formal justification of the step that he 
wants to take. These justifications are coded as short 
mnemonics. Most codes require auxiliary information or
parameters; the student types these as prefixes or 
postfixes to the code name. The prefixes are line numbers 
and specify the lines already in the proof that are to be 
operated on in order to generate the new line. For 
example, the left conjunct rule, LC, requires a single line 
reference, the line number of a conjunction already in the
derivation. 

Postfix numbers can be either occurrence numbers or
literal numbers. For example, an occurrence number is
required by the commute disjunction rule to specify which
disjunction of a complex formula is to be commuted. A
literal number is required by the number definition rule to
specify the number for which a definition is to be
generated.
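
The layout of an instruction (line-number prefixes, a code name, and an optional postfix number) can be sketched as a small parser. The grammar below is an assumption inferred from instructions such as 2.1AA, 3CE1 and DLL that appear in the examples of this chapter; the actual LIS input syntax may well differ, and `parse_instruction` is a hypothetical name.

```python
import re

# Assumed grammar: optional dot-separated line numbers, an upper-case
# code name, and an optional postfix (occurrence or literal) number.
INSTRUCTION = re.compile(r"^(\d+(?:\.\d+)*)?([A-Z]+)(\d+)?$")


def parse_instruction(text):
    """Split an instruction into (line references, code, postfix number)."""
    match = INSTRUCTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognizable instruction: {text!r}")
    prefix, code, postfix = match.groups()
    line_refs = [int(n) for n in prefix.split(".")] if prefix else []
    return line_refs, code, int(postfix) if postfix else None
```

For instance, `parse_instruction("3CE1")` separates the line reference 3, the code CE, and the occurrence number 1.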

Let us consider a very simple example - problem
406,6:

Derive: Q
P      (1) R
P      (2) R -> Q

Q is the sentence to be derived, and lines (1) and (2) are
premises. The number, 1, is the line number of the
sentence, R, and the number, 2, is the line number of the
sentence, R -> Q.

The student generates new lines by making use of the
rules available to him. If the student now types 2.1AA,
LIS generates a new line, labeled (3). The proof then
appears as:

Derive: Q
P      (1) R
P      (2) R -> Q
2.1AA  (3) Q

CORRECT

AA is a mnemonic for affirm the antecedent (modus
ponens). The format for the use of this rule is n.mAA
where n is the line number of a conditional, and m is the
line number of the antecedent of the conditional in line
n. In this example, line (2) is a conditional and line (1)
is the antecedent of that conditional. LIS, therefore,
generates Q as line (3); 2.1AA is a valid step of the
proof. Since Q is the sentence to be derived, the computer
types CORRECT and the proof is complete.

If, on the other hand, the student types 1.2AA, then
LIS would not accept the instruction, and no new line would
be generated. An error message is typed by the computer,
in this case, LINE 1 IS NOT A CONDITIONAL, and LIS then
waits for the student's next instruction. Each instruction
is checked, so that each sentence generated is
justified by the correct use of a rule of inference, axiom,
or theorem.
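
The check that LIS performs for n.mAA can be sketched in a few lines. This is a sketch under stated assumptions rather than the LIS implementation: a conditional is represented as the tuple `("->", antecedent, consequent)`, atomic sentences as strings, and the names `RuleError` and `affirm_antecedent` are hypothetical. Only the first error message is taken from the example in the text.

```python
class RuleError(Exception):
    """Raised when an instruction is not a valid application of a rule."""


def affirm_antecedent(lines, n, m):
    """Apply n.mAA: line n must be a conditional whose antecedent is line m."""
    conditional = lines[n]
    if not (isinstance(conditional, tuple) and conditional[0] == "->"):
        raise RuleError(f"LINE {n} IS NOT A CONDITIONAL")
    _, antecedent, consequent = conditional
    if lines[m] != antecedent:
        # Hypothetical wording; the text does not give this message.
        raise RuleError(f"LINE {m} IS NOT THE ANTECEDENT OF LINE {n}")
    return consequent


# Problem 406,6: premises R and R -> Q.
lines = {1: "R", 2: ("->", "R", "Q")}
```

With these premises, `affirm_antecedent(lines, 2, 1)` returns the consequent Q, while `affirm_antecedent(lines, 1, 2)` raises the error for a line that is not a conditional.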

The proof for this example requires only a single
step, but LIS would accept any other valid step as well.
If the student uses the double negation rule on
line 1, for example, then line 3 is generated as:

    (3) NOT(NOT R)

Since this is not the formula to be derived, LIS would wait
for another instruction.

As indicated above, lines are numbered consecutively
as they are generated, and, with one exception, each valid
instruction generates a new line. The instruction DLL,
delete last line, does not generate a new line. Instead,
it erases all internal references to the last line
generated; for LIS, that line no longer exists (of course,
the deleted line is not erased from the student's paper
copy of the derivation). The next line generated will have
the number of the last line deleted. A sequence of DLL's
may be used to delete a sequence of lines starting from the
most recently generated line and working backwards through
the derivation. The student, however, cannot delete
premises and he cannot delete any line in his derivation
without previously deleting all subsequent lines.


In our example, the student may decide that he does
not need line (3), and type a DLL as his second
instruction. If he then types 2.1AA, his record of the
derivation would appear as:

Derive: Q
P      (1) R
P      (2) R -> Q
1DN    (3) NOT(NOT R)
DLL
2.1AA  (3) Q

CORRECT



If he had typed a second DLL instead of the AA
instruction, he would be told that line (2) is a premise,
and cannot be deleted. He could, however, have typed 2.1AA
directly after 1DN, and the derivation would then appear
as:

Derive: Q
P      (1) R
P      (2) R -> Q
1DN    (3) NOT(NOT R)
2.1AA  (4) Q

CORRECT

Line (3) in this derivation does not bring the student
any closer to a solution, but it is a valid instruction and
is accepted by LIS. The four lines listed do constitute an
acceptable proof, but line (3) is not really used; a
precise definition of 'unused line' will be given in the
next chapter.

In this example, the use of DLL is a matter of 
convenience, but there are two situations where it may be 
necessary to eliminate some lines from a partially 
completed solution. LIS will not generate more than 31 
lines for any problem. None of the problems in the 
curriculum require more than 31 lines, but a student can 
easily generate 31 lines without completing a derivation by 
producing one or more false starts. When this happens, it
is necessary to delete some unused lines before continuing 
with the derivation. 
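
The bookkeeping described for DLL and for the 31-line limit can be sketched as follows. This is an illustration under assumptions, not the LIS code: the class and method names are hypothetical, the error wording is invented, and the limit is assumed to count premises as well as generated lines.

```python
MAX_LINES = 31  # limit stated in the text; assumed here to include premises


class Derivation:
    def __init__(self, premises):
        # Premises occupy lines 1..k and can never be deleted.
        self.lines = [("P", formula) for formula in premises]
        self.premise_count = len(premises)

    def add_line(self, justification, formula):
        """Append a generated line and return its line number."""
        if len(self.lines) >= MAX_LINES:
            raise RuntimeError("line limit reached; delete some lines first")
        self.lines.append((justification, formula))
        return len(self.lines)

    def delete_last_line(self):
        """DLL: drop the most recently generated line, never a premise."""
        if len(self.lines) <= self.premise_count:
            raise RuntimeError("no generated line to delete")
        self.lines.pop()
```

Because only the last line can be removed, the next generated line automatically reuses the number of the deleted line, as in the example with 1DN followed by DLL and 2.1AA.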

The other situation that requires the deletion of 
lines from a partial solution involves the working premise 
rule, WP. Working premises must be used in conjunction 
with either the conditional proof rule, CP, or the indirect
proof rule, IP. A brief description of these rules will be
given before continuing with the discussion.

WP allows the student to introduce any formula or 
sentence as a working premise. He may then instruct LIS to 
generate new lines from this working premise until he has 
generated the consequent of the conditional that he wishes 
to prove; CP is then used to generate the conditional 
sentence. Alternately, the student may derive a 
contradiction by using a working premise, and then use IP 
to generate the denial of the working premise. 

The use of WP begins a subsidiary derivation that must 
be completed before the solution is completed. The line 
generated by WP and all subsequent lines up to, but not 
including, the next line generated by a CP or an IP, are 
indented on the student's paper copy of the derivation to 
indicate that they are part of the subsidiary proof. 
Generating the formula to be derived in a problem within a 
subsidiary derivation does not constitute a proof for the 
problem; a different problem, with an additional premise, 
has been solved. While the student has a working premise 
that has not been referenced by a CP or an IP step, he is 
still in a subsidiary proof and cannot complete the proof.

The student may find that he has introduced a working 
premise that he does not wish to use. Any working premise 
which is not used (with either CP or IP) must be deleted 
before the proof is completed. 
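
The condition that a proof cannot be completed inside a subsidiary derivation can be sketched as a simple count of open working premises. The sketch assumes that each CP or IP step closes the most recently opened working premise and that deleted lines have already been removed from the list; the function names are hypothetical.

```python
def open_working_premises(rules):
    """Count working premises not yet closed by a CP or IP step.

    `rules` is the sequence of rule codes of the lines still in the
    derivation. Assumption: CP or IP closes the most recent open WP.
    """
    depth = 0
    for rule in rules:
        if rule == "WP":
            depth += 1
        elif rule in ("CP", "IP") and depth > 0:
            depth -= 1
    return depth


def is_complete(steps, target):
    """`steps` is a list of (rule, formula) pairs after any deletions."""
    if not steps:
        return False
    last_formula = steps[-1][1]
    return last_formula == target and open_working_premises(r for r, _ in steps) == 0


inside = [("P", "R -> Q"), ("WP", "R"), ("AA", "Q")]
closed = inside + [("CP", "R -> Q")]
```

Here `is_complete(inside, "Q")` is false even though Q has been generated, because the working premise is still open; after the CP step the derivation of R -> Q counts as complete.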

If many lines must be deleted for either of these
reasons, it may be more convenient to end the session and 
then begin a new session. The same problem is presented 
again, and the student can then restart it. 

The final point to be discussed here is the use of
substitution instances for axioms and theorems. A student
uses an axiom or a theorem by typing its code and then
hitting the enter key. LIS then types a statement of the
axiom or theorem and a list of variables in the theorem
that require substitution, and asks that a specific term be
substituted for each of these variables.

To use the additive inverse axiom, the student types
AI. LIS types the statement of the axiom, A+(-A)=0, on the
same line, and requests the single substitution required
for AI by typing A: on the following line. The student
can then reply with any term. For example, if the student
wishes to generate for line (n), 6+(-6)=0, he must type the
number, 6, after the computer types an A:.

AI  A+(-A)=0
A: 6        (n) 6+(-6)=0
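
Generating a substitution instance can be sketched as a textual replacement of each variable by the term the student supplies. This is a simplified illustration, assuming that variables are single upper-case letters and that the statement is handled as a plain string; `instantiate` is a hypothetical name, and LIS presumably works on parsed formulas rather than text.

```python
import re


def instantiate(statement, substitutions):
    """Replace each variable in `statement` with the term supplied for it."""
    if not substitutions:
        return statement
    # One pass over the statement, so substituted terms are not rewritten again.
    pattern = re.compile("|".join(re.escape(v) for v in substitutions))
    return pattern.sub(lambda m: substitutions[m.group(0)], statement)


line = instantiate("A+(-A)=0", {"A": "6"})
```

The single-pass replacement matters when the supplied term itself contains a variable name, for example substituting (B+2) for A.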

Axioms are introduced in the same way that the other rules
are introduced. Theorems are presented as derivation
problems, and become available for use after they have been
proved.

I shall conclude this discussion with an example of a
proof for a derivation problem from the algebra part of the
curriculum. A brief explanation of each step is given
after the solution. Further examples are presented in
Appendix C.

406,24:

DERIVE: A+A=3+3 -> A+A=6

P      (1) A=3 -> 6=A+A
P      (2) 3+3=A+A -> 3=A
WP     (3) A=3+3
DLL
WP     (3) A+A=3+3
3CE1   (4) 3+3=A+A
2.4AA  (5) 3=A
5CE1   (6) A=3
1.6AA  (7) 6=A+A
3.7CP  (8) A+A=3+3 -> 6=A+A
8CE2   (9) A+A=3+3 -> A+A=6

CORRECT



Lines (1) and (2) are premises and are typed by LIS
as part of the problem. The student's first step is
to introduce a working premise. This is a valid step, and
line (3) is generated, but it is not the working premise
that he needs. Therefore he deletes it in his next
step and introduces a new working premise (second line (3)).
CE is then used to commute the expressions in line (3)
(prefix number is 3) around the first equal sign (postfix
number is 1) to generate line (4). Line (5) is generated
by applying AA to lines (2) and (4). Lines (6) and (7) are
generated by using CE and AA respectively. Next,
conditional proof is used to generate the conditional
formula in line (8). CP requires two line references.
The first line referred to is a working premise, as it must
be, and the second line referred to is the line that is to
be the consequent of the conditional formula. Line
(8) terminates the subsidiary derivation begun in line (3).
Line (9) is generated by another application of the
CE rule. Since line (9) is the formula to be derived and
the student is no longer in a subsidiary proof, the
proof is accepted and LIS types CORRECT.



CHAPTER THREE


In this chapter, the classification procedures which 
are the basis for this study are described. In section 
3.1, an informal introductory description of the criteria
is presented. In section 3.2, the procedure is developed
formally, and in the last section 3.3, an example is 
described in detail. 

Given any two proofs for a derivation problem, we want
to be able to decide that the proofs are equivalent (given
some set of criteria) or that they are not equivalent; in
order to do this, we must define a partition on the set of
proofs.

It would have been possible to have trained human 
judges make the decisions, but I decided not to use this 
technique for two reasons. First, it is an onerous task to 
examine carefully 25 or 30 separate proofs each consisting 
of 20 or 30 steps. It is difficult to remain consistent 
for a single problem, and it is much more difficult to 
maintain consistency from problem to problem. Second, if 
this procedure were used, it would be impossible to specify
precisely the criteria employed.

With these difficulties in mind, I have decided to
specify in advance a precise set of criteria for
classifying proofs. This eliminates the problem of
maintaining consistency throughout the classification and
permits an unambiguous statement of the criteria used in
obtaining my results.

Five distinct sets of criteria for classifying proofs 
are defined in section 3.2; each of these sets of criteria 
is shown to define a partition (and thus an equivalence 
relation) on any set of proofs. It is also demonstrated 
that the sequence of partitions is nested in the sense 
that, if two proofs are equivalent under the i-th
partition, they are also equivalent under the (i+1)-th
partition. In the following paragraphs, these results are
presented informally.

3.1 INTRODUCTION TO THE CLASSIFICATION CRITERIA

The equivalence relations are defined in terms of 
specific one-to-one mappings (correspondences) of 
components of one proof onto components of another. If a 
mapping of the specified form exists between two proofs, 
they are equivalent, otherwise they are not. The proof
elements that are mapped and the nature of the mappings
vary from one equivalence relation to another, but in each
case, the equivalence of two proofs depends on a mapping 
(correspondence) between component parts of the proofs. 

I will begin the discussion with the fifth partition, 
where the criteria for equivalence are least stringent, and
work backwards to the first partition, where the criteria 
are most stringent. The nesting of the partitions is a 
consequence of the fact that restrictions are added at each 
level, from the fifth partition to the first. The 
classification procedure is illustrated in section 3.3, 
where the resulting partitions for each of the five sets of 
criteria are presented for a small sample of proofs.

For the fifth equivalence relation, the set of 
elements of each proof is the set of all rules that appear 
at least once in the used steps of the proof. The mapping
for this partition requires that the elements mapped onto 
each other be the same rule; two proofs are equivalent if 
they use exactly the same rules. 

The fourth partition also requires that equivalent 
proofs have the same set of used rules, but imposes the
additional requirement that the rules occur the same number 
of times in both proofs. Therefore, proofs that are 
equivalent under the fourth partition will also be 
equivalent under the fifth partition; the partitions are 
nested. 
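
The fifth and fourth criteria can be sketched as two key functions on the list of rule codes in the used steps of a proof. This is a minimal sketch: it assumes that unused steps have already been removed, and the function names and the sample proofs are illustrative only.

```python
from collections import Counter


def fifth_key(used_rules):
    """Fifth criterion: the set of rules appearing in the used steps."""
    return frozenset(used_rules)


def fourth_key(used_rules):
    """Fourth criterion: the same rules, each used the same number of times."""
    return frozenset(Counter(used_rules).items())


def equivalent(key, proof_a, proof_b):
    return key(proof_a) == key(proof_b)


# Hypothetical proofs, given as the rule codes of their used steps.
a = ["WP", "CE", "AA", "CE", "AA", "CP", "CE"]
b = ["WP", "AA", "CE", "CE", "CE", "AA", "CP"]
c = ["WP", "CE", "AA", "CP"]
```

Proofs `a` and `b` agree on rule counts and so are equivalent under both criteria; `c` uses the same rules a different number of times, so it is equivalent to `a` under the fifth criterion only. Equal counts always imply equal rule sets, which is why the fourth partition is nested in the fifth.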

The elements mapped under the remaining partitions are
the steps of the proofs. Under the third partition,
equivalent proofs must contain the same number of steps,
and the steps mapped onto each other must use the same
rule. The additional requirements added at the third
partition are more complicated than those for any of the
other partitions. The description included here is very
brief and incomplete in some details. One of the
requirements of the third partition is that corresponding
steps have identical arguments (arguments specify how the
rule is to be applied - see Chapter II and section 3 of
this chapter). The third partition also places
requirements on the structure of the proof, on the
relationship between the steps in the proof. The principal
requirement, added at this level, is that the steps
referred to by corresponding steps must correspond. If D
and D' are equivalent proofs under the third partition,
d(i) in D corresponds to d'(i') in D', d(i) refers to d(j),
and d'(i') refers to d'(j'), then d(j) corresponds to
d'(j'). This requirement implies a partial restriction on
the order of the steps in equivalent proofs.



The second partition includes all of the requirements
of the third and also requires that the ordinal position of
corresponding steps in equivalent proofs be the same.

The first partition is defined by the identity
mapping. The nesting of the partitions results from the
fact that the mapping for the i-th
partition imposes all of the conditions of the (i+1)-th
partition along with additional conditions.
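
Checking whether a proposed correspondence meets the requirements sketched above for the third and second partitions can be illustrated as follows. This is a loose sketch of the informal description only, not the formal procedure of section 3.2: the `Step` record, the 0-based positions, the treatment of premises as steps with rule "P", and both function names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    rule: str
    args: tuple = ()  # e.g. occurrence numbers
    refs: tuple = ()  # 0-based positions of the steps this step refers to


def satisfies_third(d, d_prime, mapping):
    """Check a correspondence `mapping` (position in d -> position in d_prime)."""
    if len(d) != len(d_prime) or sorted(mapping) != list(range(len(d))):
        return False
    if sorted(mapping.values()) != list(range(len(d_prime))):
        return False  # not one-to-one and onto
    for i, step in enumerate(d):
        image = d_prime[mapping[i]]
        if (step.rule, step.args) != (image.rule, image.args):
            return False
        # Steps referred to by corresponding steps must themselves correspond.
        if tuple(mapping[j] for j in step.refs) != image.refs:
            return False
    return True


def satisfies_second(d, d_prime, mapping):
    """Third-level conditions plus identical ordinal positions."""
    return satisfies_third(d, d_prime, mapping) and all(i == k for i, k in mapping.items())


# Two hypothetical proofs that differ only in the order of independent steps.
d = [Step("P"), Step("P"), Step("DN", refs=(0,)), Step("DN", refs=(1,)), Step("FC", refs=(2, 3))]
d_prime = [Step("P"), Step("P"), Step("DN", refs=(1,)), Step("DN", refs=(0,)), Step("FC", refs=(3, 2))]
swap = {0: 0, 1: 1, 2: 3, 3: 2, 4: 4}
```

The mapping `swap` satisfies the third-level conditions but not the second, since two corresponding steps occupy different ordinal positions; a proof compared with itself under the identity mapping satisfies both.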

3.2 DEFINITION OF THE CLASSIFICATION PROCEDURE

The development that follows will take as primitives
the units of behavior evaluated by LIS; these units will be
called instructions. A student constructs solutions to the
derivation problems by typing a sequence of
instructions. A valid solution to a derivation problem
will be called a 'proof'; a formal definition
of a proof will be presented below.

An instruction is a string of characters (possibly
including spaces) followed by a carriage return
or enter character. The carriage return or enter
character signals LIS that the instruction is complete.

After an instruction has been typed by the student,
LIS responds in one of three ways; instructions may
be classified into three mutually exclusive and exhaustive
categories on the basis of this response. If the response
is an error message, the instruction will be called an
E-instruction. If the response is a request for further
information, the instruction will be called an
I-instruction (intermediate instruction). If LIS responds
by typing a formula, then the instruction is called an
L-instruction. If the student types 'DLL' followed by a
carriage return or enter character, then the system
responds, but no formula is generated. This special type of
instruction is also classified as an L-instruction.

Def 1: A sequence of instructions is an L-step if and only
if the last instruction in the sequence is an
L-instruction and all previous instructions in the
sequence are I-instructions.

Def: A sequence of instructions is an E-step if and only
if the last instruction in the sequence is an
E-instruction and all previous instructions in the
sequence are I-instructions.

Def 2: The formula typed by the system after the last
instruction in an L-step is said to be generated by
the L-step.

Def 3: In a sequence of steps, all steps between any WP
step and the first IP or CP step following the WP step
are called conditional steps.

Def 4: The subsequence of L-steps in a sequence of steps
is a proof (or derivation) of the line, L, if and only
if the last L-step in the subsequence generates L and
is not a conditional step.



As defined here, a student's proof for a problem
consists only of L-steps, and the subsequent analysis
treats only these L-steps; E-steps are excluded from the
definition of proof, and student errors will not be
included in the following analysis. At this point in the
discussion, the distinction between L-steps and E-steps
will be dropped, and the term 'step' will be used to
designate L-steps.
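
The grouping of instructions into steps given by the definitions above can be sketched in Python. This is only an illustration, not the procedure used in the study: the function names and the one-letter codes 'E', 'I', and 'L' for the three instruction categories are assumptions introduced here.

```python
from typing import List, Tuple


def group_into_steps(kinds: List[str]) -> List[Tuple[str, List[str]]]:
    """Group a sequence of instruction types into L-steps and E-steps.

    Each element of `kinds` is 'E', 'I', or 'L' (hypothetical codes for
    the three instruction categories). A step is closed by an E- or
    L-instruction; the I-instructions typed before it belong to the
    same step. Trailing I-instructions with no closing instruction do
    not form a step and are dropped.
    """
    steps: List[Tuple[str, List[str]]] = []
    pending: List[str] = []
    for kind in kinds:
        if kind not in ("E", "I", "L"):
            raise ValueError(f"unknown instruction type: {kind!r}")
        pending.append(kind)
        if kind in ("E", "L"):
            steps.append((kind + "-step", pending))
            pending = []
    return steps


def l_steps_only(kinds: List[str]) -> List[Tuple[str, List[str]]]:
    """Keep only the L-steps, which are the steps that can enter a proof."""
    return [step for step in group_into_steps(kinds) if step[0] == "L-step"]
```

For example, the sequence I, L, E, I, I, L is grouped into an L-step, an E-step, and a second L-step, and only the two L-steps are retained for the analysis.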

The sequence of steps that defines a proof generates a
sequence of formulas with the formula to be derived as the
last formula in the sequence. LIS associates with each of
these formulas an integer that identifies it for subsequent
reference. These integers are called labels. A proof,
then, consists of a sequence of labeled steps (L-steps) in
which the last step in the sequence of steps generates the
formula to be proved.

For the purposes of the following discussion, it will
be useful to decompose any step into three functional
components. A step is then viewed as an ordered triple
consisting of:

(1) a sequence, possibly null, of numerals (called
references) that are the labels of some previous steps
in the proof

(2) a string of letters designating one of a finite set of
rules of derivation

(3) an argument list, possibly null, which provides
additional information on how the rule of the step is
to be applied

Further discussion of labels, rules, and argument lists can
be found in Chapter II.



Def 5: A step d(i) is said to refer to a step d(j) if the
label of d(j) is equal to a reference of d(i).

Def 6: There exists a chain of reference from d(i) to d(n)
iff there exists a sequence of steps d'(1),...,d'(k), such
that:

(1) d(i) = d'(1) and d(n) = d'(k)

(2) for all m = 1,...,k-1, d'(m+1) refers to d'(m)

Def 7: A step, d, in a derivation, D, is said to be used if
d is the last step in D, or if there exists a chain of
reference from d to the last step.

Th 2: If d is a used step in D, and d refers to d', then
d' is a used step in D.
Pf: Let d" be the last step in D. Since d is used in
D, there exists a chain of reference d,...,d". But d
refers to d'; so d',d,...,d" is also a chain. Since
there exists a chain from d' to d", d' is a used step
in D.
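
The set of used steps can be found by following references backward from the last step, which is what Theorem 2 licenses. The sketch below is an illustration under stated assumptions: a proof is represented as a list of (references, rule, argument list) triples, and the label of a step is assumed to equal its position in the list (1, 2, 3, ...). The name `used_labels` is introduced here. The sample data are proof F of section 3.3.

```python
from typing import List, Set, Tuple

# (references, rule, argument list); the label of a step is assumed to be
# its 1-based position in the proof.
Step = Tuple[Tuple[int, ...], str, Tuple[str, ...]]


def used_labels(proof: List[Step]) -> Set[int]:
    """Return the labels of the used steps of a proof.

    The last step is used by definition; every step referred to by a
    used step is also used (Theorem 2), so a backward traversal from
    the last step collects exactly the used steps.
    """
    if not proof:
        return set()
    used: Set[int] = set()
    stack = [len(proof)]
    while stack:
        label = stack.pop()
        if label in used:
            continue
        used.add(label)
        refs, _rule, _args = proof[label - 1]
        stack.extend(refs)
    return used


# Proof F from the example in section 3.3: the LT step is never referenced.
proof_f: List[Step] = [
    ((), "LT", ("3",)),
    ((), "Z", ("3",)),
    ((), "AI", ("A",)),
    ((3,), "CE1", ()),
    ((2, 4), "RE1", ()),
]
```

Applied to proof F, the traversal returns the labels 2 to 5 and leaves out the unused LT step.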

Let S designate a finite set of proofs for some
derivation problem: S = {D, D', D", ...}.

Def 8: D<1>D' iff D and D' are derivations in S, and D is
identical to D'.

Th 3: <1> is an equivalence relation on S.

Pf: The identity relation is an equivalence relation.



Definitions 9 and 10 are complicated by the unique
properties of Indirect Proof (IP). IP is the only rule in
the set of available rules that requires more than two
references. For steps with rules that require two
references, the interpretation of the step depends on the
order of the references. The valid use of AA, for example,
requires that the first reference be the label of an
implication and that the second formula referred to be the
antecedent of this conditional. For IP, the first
reference must be the label of a working premise, but the
only requirement on the second and third references is that
they be the labels of two formulas, one of which is the
negation of the other. A change in the order of these two
references has no effect on the validity of the step and no
effect on the formula generated by the step.

The second and third sets of equivalence criteria
(Def 9 and Def 10) place restrictions on the order of the
references in each step, and it is desirable that the
second and third references in IP steps be exceptions to
these restrictions. In order to do this, a separate
restriction on the order of the references is specified for
IP.

Def 9: D<2>D' iff D and D' are derivations in S, and there
exists a mapping of the used steps of D onto the used
steps of D', with the following properties; let d(m)
in D map into d'(m') in D'

(1) if d(m)->d'(m') and d(m) is the n-th step in the
subsequence of used steps of D, then d'(m') is the
n-th step in the subsequence of used steps of D'.

(2) d(m) and d'(m') have the same rule and the same
argument list.

(3) if d(m) uses a rule that requires either one or two
references and d(m) refers to d(i), d(j) (d(i) if
d(m) has only one reference), then d'(m') refers to
d'(i'), d'(j') and d(i)->d'(i'), d(j)->d'(j').

(4) if d(m) has rule IP, then d(m) refers to d(i), d(j),
d(k) and d'(m') refers to d'(i'), d'(j'), d'(k'),
d(i)->d'(i'), and either d(j)->d'(j'), d(k)->d'(k') or
d(j)->d'(k'), d(k)->d'(j').
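
Because property (1) forces the n-th used step of D to map to the n-th used step of D', equivalence under <2> can be tested by reducing each proof to a canonical form and comparing the results. The sketch below is an illustration only: it assumes that a proof is a list of (references, rule, argument list) triples whose labels are their 1-based positions, and that steps removed by DLL have already been taken out. The names `used_labels`, `canonical_form`, and `equivalent_2` are introduced here. The sample data are proofs C, D, and F of section 3.3.

```python
from typing import List, Set, Tuple

Step = Tuple[Tuple[int, ...], str, Tuple[str, ...]]


def used_labels(proof: List[Step]) -> Set[int]:
    """Labels of steps reachable from the last step through references."""
    used: Set[int] = set()
    stack = [len(proof)] if proof else []
    while stack:
        label = stack.pop()
        if label not in used:
            used.add(label)
            stack.extend(proof[label - 1][0])
    return used


def canonical_form(proof: List[Step]) -> List[Step]:
    """Drop unused steps, renumber references, and order IP's last two refs."""
    kept = sorted(used_labels(proof))
    new_label = {old: n for n, old in enumerate(kept, start=1)}
    result: List[Step] = []
    for old in kept:
        refs, rule, args = proof[old - 1]
        new_refs = [new_label[r] for r in refs]
        if rule == "IP" and len(new_refs) == 3:
            # The second and third references of IP are interchangeable.
            new_refs[1:] = sorted(new_refs[1:])
        result.append((tuple(new_refs), rule, args))
    return result


def equivalent_2(p: List[Step], q: List[Step]) -> bool:
    """True if p and q have the same canonical form."""
    return canonical_form(p) == canonical_form(q)


proof_c: List[Step] = [
    ((), "Z", ("3",)), ((), "AI", ("A",)), ((2,), "CE1", ()), ((1, 3), "RE1", ()),
]
proof_d: List[Step] = [
    ((), "AI", ("A",)), ((), "Z", ("3",)), ((1,), "CE1", ()), ((2, 3), "RE1", ()),
]
proof_f: List[Step] = [
    ((), "LT", ("3",)), ((), "Z", ("3",)), ((), "AI", ("A",)),
    ((3,), "CE1", ()), ((2, 4), "RE1", ()),
]
```

With these data, C and F reduce to the same canonical form, while C and D do not, because their first two steps appear in a different order.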

Th 4: <2> is an equivalence relation.

Pf: The proof consists of showing that the three
properties that define an equivalence relation hold
for <2>. In this proof, numerals used as subscripts
designate the first, second, or third step referred to
by some step.

(A) Reflexivity: D<2>D.

Define a mapping of D onto D such that d(m)->d(m).
It is clearly true that properties (1) and (2) of <2>
hold. If d refers to some sequence of steps d(i),
i = 1,2,3, then d(i)->d(i); so properties (3) and (4)
also hold.

(B) Symmetry: If D<2>D', then D'<2>D.

Assume that D<2>D'. Then there exists a mapping,
d(m)->d'(m'), with properties (1) to (4). For D'<2>D,
define the inverse mapping d'(m')->d(m).

Since properties (1) and (2) hold for the mapping
of D onto D', they hold for the mapping of D' onto D.

Let d' be any step in D', and let d be the step in
D that maps into d' under D->D'; under D'->D, d'->d.
If d in D refers to d(i), i = 1,2, and d->d' in D',
then d' refers to d'(i), i = 1,2, where d(i)->d'(i),
i = 1,2 (by property (3) of D->D'). Under the inverse
mapping d'->d, d' refers to d'(i), i = 1,2, and
d'(i)->d(i), i = 1,2. Therefore property (3) holds for
the inverse mapping.

If d has rule IP, then d and d' have three line
references. d refers to d(i), i = 1,2,3 and d' refers
to d'(i), i = 1,2,3. Under D->D', either d(i)->d'(i),
i = 1,2,3, or d(1)->d'(1), d(2)->d'(3) and
d(3)->d'(2). If the former condition holds, then
d'(i)->d(i), i = 1,2,3 under D'->D. If the latter
condition occurs, then d'(1)->d(1), d'(3)->d(2), and
d'(2)->d(3). In either case, property (4) holds for
D'->D.

(C) Transitivity: If D<2>D' and D'<2>D", then D<2>D".

Assume that D<2>D' and D'<2>D". Then there exist
mappings D->D' and D'->D", with properties (1) to (4).
Let d be any step in D; d->d' and d'->d". Define a
new mapping from the used steps of D onto the used
steps of D" such that d->d" for all d in D.

By property (1) of D->D' and D'->D", if d is the
n-th step in D then d" is the n-th step in D".
Property (1) holds for the new mapping.

Since d has the same rule and sequence of arguments
as d' and d' has the same rule and sequence of
arguments as d", d and d" have the same rule and the
same sequence of arguments. Property (2) holds for
the new mapping.

Let d refer to d(i), i = 1,2, d' refer to d'(i),
i = 1,2, and d" refer to d"(i), i = 1,2. Under the
new mapping d->d", and d(i)->d"(i), i = 1,2, so
property (3) holds for the new mapping.

If d has rule IP, then d, d', and d" all have three
line references. Under D->D', d(1)->d'(1), and under
D'->D", d'(1)->d"(1); under D->D", d(1)->d"(1). For
the second and third references there are four
possible cases, since there are two cases for D->D'
and two cases for D'->D". Assume that d(2)->d'(3),
d(3)->d'(2), and d'(2)->d"(2), d'(3)->d"(3). Then
d(2)->d"(3) and d(3)->d"(2), and property (4) holds in
this case. In a similar fashion, it can be shown that
property (4) holds in the other three cases as well.

Def 10: D<3>D' iff D and D' are derivations in S, and
there exists a mapping of the used steps of D onto the
used steps of D', with the following properties; let
d(m) in D map into d'(m') in D'

(1) d(m) and d'(m') have the same rule and the same
argument list.

(2) if d(m) uses a rule that requires either one or two
references and d(m) refers to d(i), d(j) (d(i) if d(m)
has only one reference), then d'(m') refers to d'(i'),
d'(j') and d(i)->d'(i'), d(j)->d'(j').

(3) if d(m) has rule IP, then d(m) refers to d(i), d(j),
d(k) and d'(m') refers to d'(i'), d'(j'), d'(k'),
d(i)->d'(i'), and either d(j)->d'(j'), d(k)->d'(k') or
d(j)->d'(k'), d(k)->d'(j').

Th 5: <3> is an equivalence relation on S.
Pf: The proof of theorem 5 follows the same pattern as
the proof of theorem 4.



Derivations on the logic program consist of a sequence
of steps and each step applies one of a finite set of
rules. Let R(i) be the i-th rule in the set of rules;
i = 1,...,M. The order of the rules is not important.

Def 11: R(i) is said to occur in D, if some used step in D
applies R(i).



Since rules may occur more than once in a derivation,
we will designate the number of occurrences of R(i) in D by
N(i). It should be emphasized that the definition of
occurrence for a rule is restricted to used steps. The
sequence of numbers, N(i), is a frequency distribution over
the set of rules.

Def 12: D<4>D' iff D and D' are derivations in S, and for
every rule, the frequency of occurrence in D is the
same as the frequency of occurrence in D'; for
i = 1,...,M, N(i) = N'(i).

Th 6: <4> is an equivalence relation on S.
Pf:

(A) Reflexivity: D<4>D. D has the same frequency
distribution for rules as itself.

(B) Symmetry: If D<4>D', then D'<4>D. If D<4>D', then
N(i) = N'(i), i = 1,...,M. But then D'<4>D.

(C) Transitivity: If D<4>D' and D'<4>D", then D<4>D".
Assume that D<4>D' and D'<4>D". Then N(i) = N'(i),
and N'(i) = N"(i), i = 1,...,M. Therefore
N(i) = N"(i), i = 1,...,M and D<4>D".

Def 13: D<5>D' iff D and D' are derivations in S, and a
rule occurs in the used steps of D iff the same rule
occurs in the used steps of D'; for i = 1,...,M,
N(i) = 0 iff N'(i) = 0.

Th 7: <5> is an equivalence relation on S.

Pf: Here the frequency distribution over the set of
rules is reduced to a set of 0-1 variables, O(i). Let
O(i) = 1 if N(i) is not equal to 0 and let O(i) = 0 if
N(i) is equal to 0. This theorem is then a special
case of the previous theorem.
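
The fourth and fifth criteria depend only on which rules are applied by used steps and how often. The sketch below illustrates both tests. It is not the program used in the study; it assumes a proof is a list of (references, rule, argument list) triples labeled by position, and it assumes that a trailing digit on a rule name (as in CA1 or RE1) is a position indicator rather than part of the rule, which is how the example in section 3.3 counts them. All function names are introduced here.

```python
from collections import Counter
from typing import List, Set, Tuple

Step = Tuple[Tuple[int, ...], str, Tuple[str, ...]]


def used_labels(proof: List[Step]) -> Set[int]:
    """Labels of steps reachable from the last step through references."""
    used: Set[int] = set()
    stack = [len(proof)] if proof else []
    while stack:
        label = stack.pop()
        if label not in used:
            used.add(label)
            stack.extend(proof[label - 1][0])
    return used


def rule_counts(proof: List[Step]) -> Counter:
    """Frequency distribution N(i) of rules over the used steps."""
    counts: Counter = Counter()
    for label in used_labels(proof):
        # Assumption: 'CA1', 'CA2', 'CA3' all count as the rule 'CA'.
        counts[proof[label - 1][1].rstrip("0123456789")] += 1
    return counts


def equivalent_4(p: List[Step], q: List[Step]) -> bool:
    """Same frequency of occurrence for every rule."""
    return rule_counts(p) == rule_counts(q)


def equivalent_5(p: List[Step], q: List[Step]) -> bool:
    """Same set of rules occurring, regardless of frequency."""
    return set(rule_counts(p)) == set(rule_counts(q))


proof_a: List[Step] = [
    ((), "AI", ("A",)), ((), "Z", ("3",)), ((1,), "AE", ("3",)),
    ((2,), "CA1", ()), ((3, 4), "RE1", ()), ((5,), "CA2", ()),
]
proof_g: List[Step] = [
    ((), "AI", ("A",)), ((1,), "AE", ("3",)), ((2,), "CA1", ()),
    ((3,), "CA3", ()), ((4,), "CA1", ()), ((5,), "CA2", ()),
    ((), "Z", ("3",)), ((6, 7), "RE1", ()),
]
```

Proofs A and G of section 3.3 use the same rules, but G applies CA four times and A twice, so the two proofs are equivalent under <5> and not under <4>.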

Th 8: Let D and D' be solutions to some derivation
problem. For i = 1,...,4, if D<i>D' then D<i+1>D'.

Pf:

(1) If D<1>D', then D is identical to D'; D=D'. Since <2>
is an equivalence relation, D<2>D; that is, D<2>D'.

(2) If D<2>D', then a mapping D->D' exists with properties
(2) to (4) of definition 9. The weaker mapping of
definition 10 is defined by these three properties.
So D<3>D'.

(3) If D<3>D', then there exists a mapping, D->D', with
property (1) of definition 10. For every occurrence
of R(i) in D, there is a corresponding occurrence of
R(i) in D'. So N(i) = N'(i) for i = 1,...,M, and
D<4>D'.

(4) If D<4>D', then N(i) = N'(i) for i = 1,...,M. So
O(i) = O'(i) for i = 1,...,M and D<5>D'.

3.3 EXAMPLE OF CLASSIFICATION OF PROOFS

Eight proofs for problem 414035 are included in this
section to illustrate how the nested classification
procedures work. Although these proofs were selected from
the data used in this study, they are not meant to be
representative of the general data base or even the data
for this problem. The proofs in this subsample were
selected so that the number of classes would decrease by
one or two from each partition to the next. Problem 414035
was selected for two reasons. First, the proofs generated
for it show enough differences at each level of equivalence
to illustrate the procedure. Second, the proofs are short
enough to permit a relatively clear presentation of the
differences without the distraction of too much detail.
The statement of problem 414035 is:

414035:

'AI' stands for the ADDITIVE INVERSE AXIOM.
DERIVE: 3+(A+(-A))=3

The eight proofs are found in Table 1, and are labeled
from 'A' to 'H' for ease of reference. Under the first
partition, each proof in the subsample defines a separate
equivalence class; the eight proofs were chosen so that
this would be the case.

None of the proofs in Table 1 are identical, but they
all have certain things in common. Each uses the two
axioms, AI (additive inverse axiom) and Z (zero axiom), and
some subset of the following rules:

AE - add equal terms to both sides of an equation

CA - commute addition

CE - commute around an equal sign

LT - logical truth

RE - replace equals

The similarity in the proofs is not surprising since they
are all proofs for the same formula.

The equivalence classes under the second partition are
also defined by paradigm proofs for each class; the
paradigm proofs for the second partition are listed in
Table 2.

Proofs C and F are now equivalent. In proof F (see
Table 1 for the original form of proof F), the first step
is not referred to by any subsequent step; the first step is
an unused step and is eliminated from the proof before the
comparisons for the second classification are done. The
line reference numbers in proof F are also changed to
reflect the elimination of the first step. When this is
done proofs C and F are identical.

The paradigm proof for proof H is also changed. The
DLL step and the CE step that was deleted by the DLL are
removed. In this case, no changes in line reference
numbers are required. In proof B, an unused LT step is
removed and subsequent line reference numbers are changed.

The paradigm proofs for the third partition are listed
in Table 3. Under this partition, proofs A and H are
equivalent, and proofs C, D and F are equivalent.

If we examine proofs C and D in Table 2, we see that
each has four steps. The first two steps in D are
identical to the first two steps in C but the order is
reversed; this change in order has no effect on the form of
the lines generated. The rules for the last two steps are
the same in the two proofs, but the line reference numbers
are different. This difference is due to the fact that the
order of the first two steps is different in the two
proofs. The third step (CE1) in each proof refers to the
previous AI step. The fourth step in each proof refers to
the Z step and the CE step in that order. The structure of
the two proofs is the same, and the apparent differences
all result from the arbitrary reversal of the first two
lines.
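
The comparison of proofs C and D can be mechanized. Under the third criterion the correspondence between used steps is forced once the last steps are paired, because each pair of corresponding steps must refer to corresponding steps. The sketch below follows that idea for proofs that do not use IP, whose interchangeable second and third references would require trying both pairings. It assumes a one-to-one mapping (as the symmetry argument of Theorem 4 presupposes) and the same list-of-triples representation with positional labels; the function name `equivalent_3` is introduced here.

```python
from typing import Dict, List, Tuple

Step = Tuple[Tuple[int, ...], str, Tuple[str, ...]]


def equivalent_3(p: List[Step], q: List[Step]) -> bool:
    """Test the third criterion for proofs without IP steps.

    Starting from the two last steps, corresponding steps must have the
    same rule, the same argument list, and references that correspond
    in order. Unused steps are never visited.
    """
    if not p or not q:
        return not p and not q
    fwd: Dict[int, int] = {}   # label in p -> label in q
    bwd: Dict[int, int] = {}   # label in q -> label in p

    def match(i: int, j: int) -> bool:
        if i in fwd or j in bwd:
            # Already paired: the pairing must be the same one.
            return fwd.get(i) == j and bwd.get(j) == i
        refs_p, rule_p, args_p = p[i - 1]
        refs_q, rule_q, args_q = q[j - 1]
        if rule_p == "IP" or rule_q == "IP":
            raise NotImplementedError("IP steps are outside this sketch")
        if (rule_p, args_p) != (rule_q, args_q) or len(refs_p) != len(refs_q):
            return False
        fwd[i], bwd[j] = j, i
        return all(match(r, s) for r, s in zip(refs_p, refs_q))

    return match(len(p), len(q))


proof_c: List[Step] = [
    ((), "Z", ("3",)), ((), "AI", ("A",)), ((2,), "CE1", ()), ((1, 3), "RE1", ()),
]
proof_d: List[Step] = [
    ((), "AI", ("A",)), ((), "Z", ("3",)), ((1,), "CE1", ()), ((2, 3), "RE1", ()),
]
proof_a: List[Step] = [
    ((), "AI", ("A",)), ((), "Z", ("3",)), ((1,), "AE", ("3",)),
    ((2,), "CA1", ()), ((3, 4), "RE1", ()), ((5,), "CA2", ()),
]
proof_b: List[Step] = [
    ((), "LT", ("3",)), ((), "AI", ("A",)), ((), "Z", ("3",)), ((2,), "AE", ("3",)),
    ((4,), "CA3", ()), ((5, 3), "RE1", ()), ((6,), "CA2", ()),
]
```

With these data the test pairs the Z, AI, CE1, and RE1 steps of C and D despite the reversed order of the first two lines, and it separates A from B, where the first reference of the RE1 step is an AE step in one proof and a CA3 step in the other.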

The equivalence classes for the fourth partition are
defined by frequency distributions over the available rules
(see Table 4). For convenience, the distributions in
Table 4 and Table 5 are taken over the limited set of rules
that actually appear in the subsample.

In going from the third partition to the fourth
partition, three classes are combined into one class
(A,H,B,E). The proofs in this class have the same number
of steps and the same frequency distribution over the
rules. The differences that exist between these proofs
(see Table 3) are in the order in which the operations are
performed, and the lines in the proof that the operations
are performed on.

In order to clarify the distinction between the third
and fourth partitions, I will compare two proofs (A,H and
B) that are equivalent under the fourth partition but not
under the third. The first three steps in the two proofs
are the same. The rules for the remaining steps are also
the same, but they are used somewhat differently in the two
proofs.

In both cases, the objective of the fourth and fifth
steps is to replace the term, '0+3', in line 3 by the term,
'3'. If some line in a proof is of the form A=B, where A
and B are terms, the replace equals rule, RE, allows the
substitution of B for any occurrence of A in the proof.
Using line 2 and RE, any occurrence of '3+0' can be
replaced by '3'. The term that appears in line 3 is
'0+3', and RE cannot be used with line 2 to replace this
term by '3'.

The A,H proof resolves this difficulty by using the CA1
step (commute addition around the first plus sign in the
equation) on line 2 to form line 4, '0+3=3'. In step 5, RE
is used in conjunction with line 4 to replace '0+3' in
line 3 by '3'.

The B proof uses CA3 (commute addition around the
third plus sign) on line 3, changing '0+3' to '3+0'. RE
is then used with line 2 to generate line 5.

The sixth step is the same for both proofs. This is a
relatively minor variation but it does indicate a
difference of approach in producing proofs.

The only change in going from the fourth partition to 
the fifth partition is the inclusion of proof G in the 
A,H,B,E class. Proof G has a useless transformation in the 
third step that is corrected in the fifth step. Thus there 
are two unnecessary uses of the CA rule in proof G.

TABLE 3.1 - FIRST PARTITION

PROOF A
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .AE.,3    (3)   (A+(-A))+3=0+3
    2.          .CA1.*    (4)   0+3=3
    3.   4.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

PROOF B
                .LT.,3    (1)   3=3
                .AI.,A    (2)   A+(-A)=0
                .Z.,3     (3)   3+0=3
    2.          .AE.,3    (4)   (A+(-A))+3=0+3
    4.          .CA3.*    (5)   (A+(-A))+3=3+0
    5.   3.     .RE1.*    (6)   (A+(-A))+3=3
    6.          .CA2.*    (7)   3+(A+(-A))=3

PROOF C
                .Z.,3     (1)   3+0=3
                .AI.,A    (2)   A+(-A)=0
    2.          .CE1.*    (3)   0=A+(-A)
    1.   3.     .RE1.*    (4)   3+(A+(-A))=3

PROOF D
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .CE1.*    (3)   0=A+(-A)
    2.   3.     .RE1.*    (4)   3+(A+(-A))=3

PROOF E
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA2.*    (3)   3+(A+(-A))=0+3
                .Z.,3     (4)   3+0=3
    4.          .CA1.*    (5)   0+3=3
    3.   5.     .RE1.*    (6)   3+(A+(-A))=3

PROOF F
                .LT.,3    (1)   3=3
                .Z.,3     (2)   3+0=3
                .AI.,A    (3)   A+(-A)=0
    3.          .CE1.*    (4)   0=A+(-A)
    2.   4.     .RE1.*    (5)   3+(A+(-A))=3

PROOF G
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA1.*    (3)   ((-A)+A)+3=0+3
    3.          .CA3.*    (4)   ((-A)+A)+3=3+0
    4.          .CA1.*    (5)   (A+(-A))+3=3+0
    5.          .CA2.*    (6)   3+(A+(-A))=3+0
                .Z.,3     (7)   3+0=3
    6.   7.     .RE1.*    (8)   3+(A+(-A))=3

PROOF H
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
                .Z.,3     (3)   3+0=3
    3.          .CE1.*    (4)   3=3+0
                .DLL.*
    3.          .CA1.*    (4)   0+3=3
    2.   4.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

TABLE 3.2 - SECOND PARTITION

PROOF A
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .AE.,3    (3)   (A+(-A))+3=0+3
    2.          .CA1.*    (4)   0+3=3
    3.   4.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

PROOF B
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .AE.,3    (3)   (A+(-A))+3=0+3
    3.          .CA3.*    (4)   (A+(-A))+3=3+0
    4.   2.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

PROOFS C,F
                .Z.,3     (1)   3+0=3
                .AI.,A    (2)   A+(-A)=0
    2.          .CE1.*    (3)   0=A+(-A)
    1.   3.     .RE1.*    (4)   3+(A+(-A))=3

PROOF D
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .CE1.*    (3)   0=A+(-A)
    2.   3.     .RE1.*    (4)   3+(A+(-A))=3

PROOF E
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA2.*    (3)   3+(A+(-A))=0+3
                .Z.,3     (4)   3+0=3
    4.          .CA1.*    (5)   0+3=3
    3.   5.     .RE1.*    (6)   3+(A+(-A))=3

PROOF G
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA1.*    (3)   ((-A)+A)+3=0+3
    3.          .CA3.*    (4)   ((-A)+A)+3=3+0
    4.          .CA1.*    (5)   (A+(-A))+3=3+0
    5.          .CA2.*    (6)   3+(A+(-A))=3+0
                .Z.,3     (7)   3+0=3
    6.   7.     .RE1.*    (8)   3+(A+(-A))=3

PROOF H
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
                .Z.,3     (3)   3+0=3
    3.          .CA1.*    (4)   0+3=3
    2.   4.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

TABLE 3.3 - THIRD PARTITION

PROOFS A,H
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .AE.,3    (3)   (A+(-A))+3=0+3
    2.          .CA1.*    (4)   0+3=3
    3.   4.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

PROOF B
                .AI.,A    (1)   A+(-A)=0
                .Z.,3     (2)   3+0=3
    1.          .AE.,3    (3)   (A+(-A))+3=0+3
    3.          .CA3.*    (4)   (A+(-A))+3=3+0
    4.   2.     .RE1.*    (5)   (A+(-A))+3=3
    5.          .CA2.*    (6)   3+(A+(-A))=3

PROOFS C,F,D
                .Z.,3     (1)   3+0=3
                .AI.,A    (2)   A+(-A)=0
    2.          .CE1.*    (3)   0=A+(-A)
    1.   3.     .RE1.*    (4)   3+(A+(-A))=3

PROOF E
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA2.*    (3)   3+(A+(-A))=0+3
                .Z.,3     (4)   3+0=3
    4.          .CA1.*    (5)   0+3=3
    3.   5.     .RE1.*    (6)   3+(A+(-A))=3

PROOF G
                .AI.,A    (1)   A+(-A)=0
    1.          .AE.,3    (2)   (A+(-A))+3=0+3
    2.          .CA1.*    (3)   ((-A)+A)+3=0+3
    3.          .CA3.*    (4)   ((-A)+A)+3=3+0
    4.          .CA1.*    (5)   (A+(-A))+3=3+0
    5.          .CA2.*    (6)   3+(A+(-A))=3+0
                .Z.,3     (7)   3+0=3
    6.   7.     .RE1.*    (8)   3+(A+(-A))=3

TABLE 3.4 - THE FOURTH PARTITION

    AE   AI   CA   CE   RE   Z     PROOFS
     1    1    2    0    1   1     A,H,B,E
     0    1    0    1    1   1     C,F,D
     1    1    4    0    1   1     G

TABLE 3.5 - THE FIFTH PARTITION

    AE   AI   CA   CE   RE   Z     PROOFS
     X    X    X         X   X     A,H,B,E,G
          X         X    X   X     C,F,D

CHAPTER FOUR



Since one objective of the logic program is to develop 
flexibility in the student's approach to the construction 
of proofs, it is desirable to avoid the inclusion of 
derivation problems which encourage stereotyped proof 
behavior. For the future development of this curriculum 
(and similar curricula), it would be useful to know how the 
attributes of derivation problems affect the degree of 
diversity found in proof behavior. The analysis described 
below is designed to identify those characteristics of 
derivation problems which best predict the amount of 
variation found in a sample of proofs for the problems. 
For each problem in the curriculum and for each set of 
criteria, student proofs were classified into equivalence 
classes. The number of different classes occurring for a 
particular problem was used as a measure of variability of 
student proofs for the problem. 
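
Once a set of criteria has been reduced to a key that is the same for equivalent proofs, the measure of variability is simply the number of distinct keys in the sample. The sketch below illustrates this with the rule frequencies of the eight example proofs of section 3.3 (columns AE, AI, CA, CE, RE, Z). The function name `count_classes` and the dictionary layout are assumptions introduced here for illustration.

```python
from typing import Callable, Dict, Hashable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def count_classes(items: Iterable[T], key: Callable[[T], Hashable]) -> int:
    """Number of equivalence classes induced by `key` on `items`."""
    return len({key(item) for item in items})


# Rule frequencies (AE, AI, CA, CE, RE, Z) over used steps for the eight
# example proofs of section 3.3.
frequencies: Dict[str, Tuple[int, ...]] = {
    "A": (1, 1, 2, 0, 1, 1),
    "B": (1, 1, 2, 0, 1, 1),
    "C": (0, 1, 0, 1, 1, 1),
    "D": (0, 1, 0, 1, 1, 1),
    "E": (1, 1, 2, 0, 1, 1),
    "F": (0, 1, 0, 1, 1, 1),
    "G": (1, 1, 4, 0, 1, 1),
    "H": (1, 1, 2, 0, 1, 1),
}

# Fourth partition: proofs with the same frequency distribution.
c4 = count_classes(frequencies, key=lambda name: frequencies[name])

# Fifth partition: proofs in which the same rules occur at all.
c5 = count_classes(
    frequencies, key=lambda name: tuple(n > 0 for n in frequencies[name])
)
```

For the example subsample this gives three classes under the fourth partition and two under the fifth.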

After the sample of proofs had been partitioned, the 
relationship between the number of classes per problem and 
the structural attributes of the problems was investigated 
using multiple linear regression. 

Since linear regression is a commonly used technique,
the details of this method will not be included here. A
discussion of the way in which regression analysis was used
in this study and of the assumptions involved in using
regression is found in section 4.5. The model assumed in
all of the analyses is linear:

Y(j) = a(0) + a(1)*X(1,j) + ... + a(n)*X(n,j) + e(j)

where Y(j) is the value of the dependent variable for the
j-th problem, X(i,j) is the value of the i-th independent
variable for the j-th problem, the a(i) are constants,
and e(j) is the error term for the j-th problem.
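
As an illustration of the linear model only, the sketch below estimates the constants a(i) by ordinary least squares, solving the normal equations with Gauss-Jordan elimination. It is not the stepwise procedure used in the study, the constant term a(0) is an assumption of the sketch, and the data in the example are invented for the purpose of checking the arithmetic. The function name `fit_linear` is introduced here.

```python
from typing import List, Sequence


def fit_linear(xs: Sequence[Sequence[float]], ys: Sequence[float]) -> List[float]:
    """Least-squares estimates [a0, a1, ..., an] for y = a0 + sum(ai * xi).

    `xs` holds one row of independent-variable values per problem and
    `ys` the corresponding values of the dependent variable.
    """
    rows = [[1.0] + [float(v) for v in row] for row in xs]
    k = len(rows[0])
    # Augmented matrix of the normal equations (X'X) a = X'y.
    m = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, ys))]
        for i in range(k)
    ]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]


# Invented data generated exactly from y = 2 + 3*x1 - 1*x2.
example_xs = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
example_ys = [2, 5, 1, 4, 7]
coefficients = fit_linear(example_xs, example_ys)
```

Because the invented data contain no error term, the estimates recover the generating constants 2, 3, and -1 up to rounding error.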

For each of the five measures of variation, a separate
regression analysis was run, with the number of classes per
problem as the dependent variable. The independent
variables used are similar to those used in a previous
study of the Stanford Logic-algebra curriculum (Moloney,
1971), and these variables are discussed in section 4.4.

4.2 THE SAMPLE OF DATA

During the summer quarter of 1970, the
Logic-Instructional system was used as an integral part of
the introductory logic course (Philosophy 157) at Stanford
University. The students were proctored during their
sessions at the computer terminals by the philosophy
graduate students who gave the lectures in the course. The
course consisted of two hours of traditional classroom
instruction each week in addition to the time spent working
at the computer terminals.

The LIS curriculum emphasizes the construction of
formal proofs, and it is the behavior of students in
constructing such proofs that is examined in this
dissertation. Four of the 27 students who enrolled in this
course failed to complete some parts of the curriculum
included in this study, and these students have been
dropped from the analysis.

The fact that approximately fifteen percent of the
original sample were dropped because they failed to
complete a substantial part of the curriculum raises the
possibility that the results of this study are biased by
the selection of the more successful students. If we
assume that there is no interaction effect
(student-problem), the elimination of the data for these
students would tend to affect the results for all problems
in the same way, but would not bias a comparison between
problems. Moreover, the inclusion of proofs by students
who did only part of the curriculum would introduce bias
into a comparison between problems, because the results for
some problems would be affected by these students while
others would not. So, it is necessary to drop these
students and accept the possibility of bias arising from
selection.

A similar problem of non-random selection arises when
the full set of 27 students is considered, since these
students selected themselves for this study by deciding to
enroll in Philosophy 157 in the Summer of 1970. The extent
to which the findings of this study can be generalized to
other curricula and other student populations will depend
on the extent to which the tasks and the population in this
study are representative of the target tasks and
population.

Even for the 23 students who completed the part of the
curriculum included in this study, some data were lost
because of machine failure; this problem will be discussed
in the next section. A relatively complete set of data is
available from these students for the problems in lessons
405 to 415, and it is the 127 proof problems in these
lessons that are considered in the analysis.

4.3 DEPENDENT VARIABLES (MEASURES OF VARIABILITY) 

In the analysis reported in Chapters V and VI,
stepwise regression was used to relate five measures of
variation in the sample of proofs to 17 variables that
characterize the nature of the problem. A separate
regression analysis is presented for each of the five
measures of variation. In this section, the dependent
variables (measures of variation) are discussed, and in the
next section the independent variables are discussed.

For each of the problems under consideration there are
approximately 23 proofs in the sample, and the same 23
students are used for all problems. The five sets of
equivalence criteria defined in Chapter III generate a
nested sequence of five partitions on the sample of proofs
for each problem. The first dependent variable, C1, is
defined to be the number of classes under the first
partition. The variables, C2 to C5, are defined to be the
number of classes under the second to the fifth partitions
respectively. The full set of proofs for any problem
generates a single value for each of the five dependent
variables, the number of classes of proofs for the five
partitions.

Even for those students who completed the lessons of
the curriculum included in this study, there was some loss
of data due to machine failure, and the data lost in this
way cannot be recovered. Since the machine failures that
cause this type of data loss are independent of the
students' behavior, the loss is assumed to be random.

If no data had been lost, the sample of proofs for the
23 students and 127 problems in this study would consist of
2,921 proofs. Out of this number, 51 proofs were lost
because of machine failure. Although the percentage of
missing proofs is quite small (1.7 percent), this loss of
data could be a serious problem.

The definitions of the dependent variables make it
difficult to deal with the problem of missing data. Failure
to include the proofs of one or more students cannot
increase the number of classes found, and can decrease this
number. Missing data, therefore, introduces a bias toward
lower values for all five dependent variables on the
problems with an incomplete sample of proofs. It should be
emphasized that this bias results from the nature of the
dependent variables, and exists even though the loss of
data is random.

There are 89 problems with no missing proofs, 28
problems with one proof missing, eight problems with two
missing proofs, one problem with three missing proofs, and
one problem with four missing proofs. The two problems
with more than two missing proofs were not used in the
analysis that follows, and the results for the other 125
problems were modified to correct for the missing proofs.
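
The counts reported in this section can be checked against one another. The short sketch below recomputes the totals from the distribution of missing proofs; the variable names are introduced here for illustration.

```python
# Number of problems, indexed by the number of proofs missing for the problem.
problems_by_missing = {0: 89, 1: 28, 2: 8, 3: 1, 4: 1}
students = 23

total_problems = sum(problems_by_missing.values())
full_sample = students * total_problems
lost_proofs = sum(k * n for k, n in problems_by_missing.items())
percent_lost = 100.0 * lost_proofs / full_sample

# Problems with more than two missing proofs were excluded from the analysis.
problems_kept = sum(n for k, n in problems_by_missing.items() if k <= 2)
```

The distribution accounts for all 127 problems, for a full sample of 2,921 proofs, and for the 51 lost proofs (about 1.7 percent), and it leaves 125 problems in the analysis.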

In order to correct for the missing data, some
assumptions must be made about the functional relationship
between the number of classes in the sample of proofs and
the total number of proofs in the sample. Using the
relationship assumed, the number of classes in a sample of
21 or 22 proofs can then be extrapolated to a hypothetical
sample of 23 proofs.

The nature of the dependent measures being used in
this research implies that they are monotonically
nondecreasing functions of the number of proofs in the
sample, because the inclusion of another proof in the set
being partitioned cannot decrease the number of subsets
defined by the partition but can increase this number by
one. Therefore, the desired functional relationship must
have a positive slope.

As the number of proofs that have been partitioned
increases, the probability that an additional proof would
specify a new class (not fall into a class already
specified) decreases. So, an acceptable candidate for the
functional relationship between the number of classes and
the number of proofs should have a negative second
derivative.

Examination of the set of student proofs for a 
representative sample of problems indicated that the 
relationship between the number of classes in a random 
subset of proofs and the number of proofs in the subset is 
approximated by the following formula: 

$$CL = A \cdot (SL)^{B} \qquad (1)$$

where SL is the number of proofs in the subset, CL is the
number of classes, and A and B are constants that depend on
the problem. For A positive and B between zero and one,
this function meets both of the criteria specified above.
Since a sample of one proof will always have one class, A
is equal to 1, and formula (1) reduces to:



$$CL = (SL)^{B} \qquad (2)$$
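As a brief check, added here for clarity and not part of the original derivation, the two criteria (positive slope, negative second derivative) and the value of A can be verified directly from formula (1), treating SL as a continuous variable:

```latex
\frac{d\,CL}{d\,SL} = A\,B\,(SL)^{B-1} > 0,
\qquad
\frac{d^{2}CL}{d\,SL^{2}} = A\,B\,(B-1)\,(SL)^{B-2} < 0
\qquad \text{for } A > 0,\; 0 < B < 1,\; SL > 0,

CL\big|_{SL=1} = A \cdot 1^{B} = A = 1 .
```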



The value of B for any problem can be estimated from the 
number of classes in the available set of proofs for the 
problem. Taking the logarithm of both sides of (2) gives: 



$$\ln(CL) = B \cdot \ln(SL) \qquad (3)$$



and B is then given by: 



$$B = \frac{\ln(CL)}{\ln(SL)} \qquad (4)$$



Since the actual values of CL and SL are available for each 
problem, an estimate of B for each problem can be obtained 
using (4). The predicted value for a full set of 23 proofs 
can then be calculated from formula (2). 
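The two-step procedure, estimating B with formula (4) and then projecting with formula (2), can be sketched as follows. This is a minimal illustration, not the program used in the study; the function names and the default of 23 proofs as an argument are assumptions made for the example.

```python
import math


def estimate_b(classes: int, proofs: int) -> float:
    """Estimate B from formula (4): B = ln(CL) / ln(SL)."""
    if classes < 1 or proofs < 2:
        raise ValueError("need at least one class and two proofs")
    return math.log(classes) / math.log(proofs)


def project_classes(classes: int, proofs: int, full_sample: int = 23) -> float:
    """Extrapolate the class count to the full sample using formula (2)."""
    return full_sample ** estimate_b(classes, proofs)


# Example: 2 classes observed among 21 available proofs.
b = estimate_b(2, 21)
projected = project_classes(2, 21)
print(round(b, 5), round(projected, 5))
```

For this example the results should agree closely with the second row of Table 4.1 (an estimated B near .22767 and a corrected value near 2.04186).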

Using this technique, tables of the predicted values
of CL for the possible range of the observed values of CL
have been computed, and are included in Tables 4.1 and 4.2.
Since the observed values of the dependent variables
(number of classes of proofs) are integers, the corrected
values for these variables are rounded to integers. The
final correction criteria are listed in Table 4.3.
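The rounding step can be illustrated for the case of one missing proof (22 proofs available). The sketch below is an assumption about how the integer corrections follow from formulas (2) and (4); it uses ordinary rounding to the nearest integer, and the function name is hypothetical.

```python
import math


def correction(observed: int, available: int, full_sample: int = 23) -> int:
    """Integer change added to an observed class count (illustrative)."""
    b = math.log(observed) / math.log(available)   # formula (4)
    projected = full_sample ** b                   # formula (2)
    return round(projected) - observed


# One missing proof: observed class counts can range from 1 to 22.
changes = {c: correction(c, 22) for c in range(1, 23)}
first_adjusted = min(c for c, d in changes.items() if d > 0)
print(first_adjusted)  # prints 14
```

Under these assumptions, observed counts up to 13 are left unchanged and counts from 14 to 22 are increased by one, which matches the criteria listed for one missing proof in Table 4.3.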

As a partial check on the impact of this correction 
procedure on the final results, the analyses to be 
discussed in Chapter V were also run without the eight 
problems that are missing two proofs. There were no major 
changes in the results when this was done. The corrected
values for the dependent variables are used in all of the
analyses reported in this paper.

4 - INDEPENDENT VARIABLES 

The set of independent variables used in this study is 
very similar to the set of variables used by James Moloney 
in a previous study of the same curriculum (Moloney, 1971). 
A list of the variables used in the present study is 
included as Table 4.4.

The first five variables listed in Table 4.4 quantify 
various types of structural complexity that can appear in 
the problem statements. Since these variables do not play 
a very prominent role in the analysis that follows and 
since the definitions for these variables are clear, they 
will not be discussed further here. In the remainder of 
this work, these variables will be called the 
'problem-structure variables'.

S13(AV RE), S17(R INF), S18(AV TH), S19(AV AX),
S20(TOT R), S21(PSLI), and S22(POSIT) are all defined in
terms of the problem's position in the curriculum. S13(AV
RE) is a 0-1 variable and indicates whether the problem
appears before (S13=0) or after (S13=1) the introduction
of Replace Equals. S17, S18, and S19 are counts of the
numbers of rules of inference, theorems, and axioms that
are available when the problem is reached in the
curriculum. S20(TOT R) measures the total number of rules
available when the problem is encountered, and is equal to
the sum of S17, S18, and S19. S21(PSLI) is the number of
problems since the last introduction of a rule (see Table
4.4). S22(POSIT) is defined as the ordinal position of the
particular problem in the sequence of problems considered
in this study; this variable is included to check for any
general order effect in the curriculum. These variables
are referred to as the 'rule-position variables'.

The last group of variables to be considered are those
that Moloney calls the 'standard proof variables'; the
variables in this group are S11(RE), S12(CP), S14(AXIOM),
S15(THERM), and S16(STEPS). The values of the standard
proof variables for a problem are defined in terms of a
'standard' proof for the problem. The standard proofs used
in this study are those constructed by Moloney; the same
set of problems was worked independently by the present
author and no changes were found to be necessary. The
criteria used in constructing these proofs are given by
Moloney:

     Several criteria were used by the author in
     generating the standard proofs. First, the
     author worked through the entire set of problems
     included in this study two times. The proofs
     generated the second time through are used as
     standard. An attempt was made to construct
     proofs with a minimal number of lines. Also,
     within the constraint of producing a minimal
     proof, an attempt was made to use rules and
     theorems most recently introduced, wherever
     possible. It is the judgement of the author that
     the great majority of the proofs produced are
     minimal in the sense of containing the least
     possible number of lines.



Since it is the standard proof variables that dominate the 
discussion in Chapters V and VI, some further discussion of 
these variables is appropriate. 

S16(STEPS) is just the number of steps in the standard
proof, and functions as a simple measure of the length of
the problem. The types of steps that appear in the standard
proof have no effect on this variable.

S11(RE) is the number of occurrences of the rule
Replace Equals in the standard proof. Replace Equals is an
important rule in the algebra part of the curriculum because
it permits the student to replace any expression (A) in a
formula by an expression (B) that has been shown to be equal
to expression (A). This allows the student to develop parts
of an equation independently and then to combine these
partial results into a single formula; thus it provides a
mechanism for the use of subsidiary derivations. The
problems included in this study are all drawn from the part
of the curriculum dealing with algebra.

S14(AXIOM) and S15(THERM) count the number of
occurrences in the standard proof of axioms and theorems
respectively. The use of any of the five axioms or six
theorems is counted as an occurrence; the axioms (or
theorems) have equal weight and no distinction is made
between them. If an axiom (or theorem) is used more than
once, each application is counted as an occurrence. If the
standard proof for a problem uses a particular axiom as the
rule in two separate steps, another axiom in a third step,
and none of the remaining steps use axioms, then the value
of S14 for the problem would be three.
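The counting rule for the standard proof variables can be sketched as follows. The rule labels (AX1, TH1, RE, CP) and the representation of a proof as a list with one rule label per step are assumptions made for this illustration; they are not the notation used in the curriculum.

```python
from collections import Counter

# Hypothetical labels: five axioms and six theorems, as stated in the text.
AXIOMS = {f"AX{i}" for i in range(1, 6)}
THEOREMS = {f"TH{i}" for i in range(1, 7)}


def standard_proof_variables(rules_by_step):
    """Count rule occurrences in a standard proof, one rule label per step."""
    counts = Counter(rules_by_step)
    return {
        "S11": counts["RE"],                                   # Replace Equals
        "S12": counts["CP"],                                   # Conditional Proof
        "S14": sum(n for r, n in counts.items() if r in AXIOMS),
        "S15": sum(n for r, n in counts.items() if r in THEOREMS),
        "S16": len(rules_by_step),                             # number of steps
    }


# One axiom used in two steps and another axiom used in a third step.
proof = ["AX1", "RE", "AX1", "AX3", "RE"]
values = standard_proof_variables(proof)
print(values["S14"])  # prints 3
```

As in the example in the text, every application of an axiom counts as an occurrence, so the repeated axiom contributes two to S14.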
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4^ REGRESSION ANALYSIS 

Since regression analysis is a standard technique in
educational research, the statistical theory will not be
developed here; the way in which regression is to be used in
this study and the assumptions made in interpreting F-ratios
in regression analysis will be discussed.

The research reported here is exploratory. Its primary
aim is to determine those quantifiable characteristics of
proof problems in algebra that account for the amount of
variation found in a sample of proofs for these problems.
No attempt is made to test a preconceived hypothesis, and
little attention is given to the coefficients of the linear
equations that result from the regression analyses.

The analyses reported in Chapters V and VI examine in
great detail the relationships found in the data. The
emphasis is on determining how the variation in the sample
of proofs is related to the features of the proof problems
defined by the independent variables. The use of five
different measures of variability makes it possible to
examine how the relationship between variability and problem
type changes as a function of the kind of variability
measured.

If the F-ratios that appear in the results of the
regression analyses are to be considered, the validity of
the assumptions that underlie the usual interpretation of
these F-ratios should be examined. The model being used
here is a simple linear model and the assumptions are that
the errors are independently and normally distributed with
zero mean and constant variance. For the analyses discussed
in Chapter V (using the full set of problems), there is
clear indication that the assumption of homogeneity of
variance is violated. The variance seems to be an
increasing function of the predicted value of C1. Attempts
to eliminate this nonhomogeneity by transforming the
observed values of C1 failed.

Among the plots of residuals against the independent
variables, the strongest indication of this lack of
homogeneity of variance is found for S22(POSIT); there is an
abrupt increase in variance just after the introduction of
the rule Replace Equals. This discontinuity seems to be a
property of the curriculum and not a function of the scale
chosen for the dependent variable. It is unlikely that any
continuous transformation (change of scale) for the
dependent variable will eliminate the nonhomogeneity of
variance.

However, there is no serious violation of homogeneity
of variance if only the problems that appear after the
introduction of RE are considered. In Chapter VI, the
analysis described in this chapter will be repeated, using
only the problems that appear after the introduction of RE
and that do not have any axioms.
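One simple way to look for the kind of abrupt change in residual variance described above is to compare the variance of the residuals before and after a chosen position in the curriculum. The sketch below is illustrative only: the residuals are synthetic (generated with a fixed seed, not taken from the study), the split position is an assumed parameter, and the comparison is descriptive rather than a formal test.

```python
import random
import statistics


def variance_by_group(positions, residuals, split):
    """Residual variance for problems up to and after a split position."""
    before = [e for p, e in zip(positions, residuals) if p <= split]
    after = [e for p, e in zip(positions, residuals) if p > split]
    return statistics.variance(before), statistics.variance(after)


# Synthetic residuals, NOT data from the study: small spread up to the
# split position and a larger spread afterwards.
rng = random.Random(0)
split = 53
positions = list(range(1, 126))
residuals = [rng.gauss(0.0, 1.0 if p <= split else 3.0) for p in positions]

var_before, var_after = variance_by_group(positions, residuals, split)
ratio = var_after / var_before
print(round(var_before, 2), round(var_after, 2), round(ratio, 2))
```

A ratio well above one indicates that the spread of the residuals is larger after the split. A step change of this kind is not removed by a continuous change of scale for the dependent variable, which is the point made in the text.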
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TABLE 4.1 



PROJECTED NUMBER OF CLASSES FOR 23 STUDENTS
WHEN 21 SOLUTIONS ARE ACTUALLY CLASSIFIED

THE MODEL USED IS GIVEN BY:

$$CL = A \cdot (SL)^{B}$$

WHERE CL IS THE NUMBER OF CLASSES
      SL IS THE NUMBER OF SOLUTIONS

OBSERVED     CORRECTED     EST-B*

 1.00000      1.00000       .00000
 2.00000      2.04186       .22767
 3.00000      3.10012       .36085
 4.00000      4.16917       .45534
 5.00000      5.24633       .52863
 6.00000      6.32999       .58852
 7.00000      7.41908       .63915
 8.00000      8.51285       .68301
 9.00000      9.61072       .72170
10.00000     10.71225       .75630
11.00000     11.81708       .78761
12.00000     12.92492       .81619
13.00000     14.03552       .84248
14.00000     15.14866       .86682
15.00000     16.26423       .88948
16.00000     17.38200       .91068
17.00000     18.50186       .93059
18.00000     19.62369       .94937
19.00000     20.74739       .96713
20.00000     21.87285       .98397
21.00000     23.00000      1.00000

* EST-B IS THE ESTIMATED VALUE OF B
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TABLE 4.2 



PROJECTED NUMBER OF CLASSES FOR 23 STUDENTS
WHEN 22 SOLUTIONS ARE ACTUALLY CLASSIFIED

THE MODEL USED IS GIVEN BY:

$$CL = A \cdot (SL)^{B}$$

WHERE CL IS THE NUMBER OF CLASSES
      SL IS THE NUMBER OF SOLUTIONS

OBSERVED     CORRECTED     EST-B*

 1.00000      1.00000       .00000
 2.00000      2.02004       .22424
 3.00000      3.04777       .35542
 4.00000      4.08054       .44849
 5.00000      5.11707       .52068
 6.00000      6.15661       .57966
 7.00000      7.19865       .62953
 8.00000      8.24285       .67273
 9.00000      9.28892       .71084
10.00000     10.33667       .74492
11.00000     11.38594       .77576
12.00000     12.43657       .80391
13.00000     13.48847       .82980
14.00000     14.54154       .85378
15.00000     15.59568       .87610
16.00000     16.65084       .89698
17.00000     17.70695       .91659
18.00000     18.76395       .93508
19.00000     19.82180       .95257
20.00000     20.88045       .96917
21.00000     21.93986       .98495
22.00000     23.00000      1.00000

* EST-B IS THE ESTIMATED VALUE OF B
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TABLE 4.3 



FINAL CORRECTION CRITERIA FOR THE DEPENDENT VARIABLE

ONE MISSING PROOF

NUMBER OF CLASSES FOUND     CHANGE

         0-13                  0
        14-22                 +1



TWO MISSING PROOFS 
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TABLE 4.4 



LIST OF INDEPENDENT VARIABLES

S6   (WORDS)   NUMBER OF WORDS PER PROBLEM
S7   (SYMBL)   NUMBER OF SYMBOLS IN THE FORMULA TO BE
               DERIVED
S8   (LOGCN)   NUMBER OF LOGICAL CONNECTIVES IN THE
               FORMULA TO BE DERIVED
S9   (PAREN)   DEPTH OF NESTING OF THE MOST DEEPLY
               NESTED EXPRESSION IN THE FORMULA TO
               BE PROVED
S10  (PREMS)   NUMBER OF PREMISES
S11  (RE)      THE NUMBER OF OCCURRENCES OF REPLACE EQUALS
S12  (CP)      THE NUMBER OF OCCURRENCES OF CONDITIONAL
               PROOF (CP)
S13  (AV RE)   A 0-1 VARIABLE INDICATING THE
               AVAILABILITY OF REPLACE EQUALS
S14  (AXIOM)   THE NUMBER OF OCCURRENCES OF ANY AXIOM
S15  (THERM)   THE NUMBER OF OCCURRENCES OF ANY THEOREM
S16  (STEPS)   THE NUMBER OF STEPS IN THE STANDARD PROOF
S17  (R INF)   THE NUMBER OF RULES OF INFERENCE AVAILABLE
S18  (AV TH)   THE NUMBER OF THEOREMS AVAILABLE
S19  (AV AX)   THE NUMBER OF AXIOMS AVAILABLE
S20  (TOT R)   THE TOTAL NUMBER OF RULES AVAILABLE WHEN
               THE PROBLEM IS DONE
S21  (PSLI)    THE NUMBER OF PROBLEMS SINCE THE LAST
               INTRODUCTION OF A RULE
S22  (POSIT)   THE ORDINAL POSITION OF THE PROBLEM IN THE
               PORTION OF THE CURRICULUM BEING STUDIED
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CHAPTER FIVE 



In this chapter, the results of the regression
analyses for the full set of problems will be examined. A
separate regression analysis was run for each of the five
partitions discussed in Chapter III. For the first analysis,
the number of classes in the first partition of the proofs
for a problem is taken as the value of the dependent
variable, C1, for that problem. Separate dependent
variables (C2 - C5) are defined analogously for each of the
other four partitions, and the regression analyses using
these dependent variables are discussed in order. In each
case, the set of 17 independent variables described in
Chapter IV is used.

Since the nonhomogeneity of variance discussed in
Chapter IV occurs for all five of the regression analyses
discussed in this chapter, the F-ratios computed in these
analyses will not be interpreted. The discussion here
emphasizes a detailed examination of the results and
ignores hypothesis testing considerations.

The means and standard deviations for the full set of
22 variables (5 dependent and 17 independent) are listed in
Table 5.1, and the correlation matrix is found in Table
5.2. Variables numbered from 1 to 5 are the dependent
variables, and variables numbered from 6 to 22 are the
independent variables.

Examination of Table 5.2 indicates a number of
interesting trends. The first five columns contain the
correlations of the five dependent variables with each
other. All of these correlations are high (greater than
0.69), and the partitions closest in the sequence from one
to five have the highest correlations.

The remaining entries in the first five rows are the
correlations between the five dependent variables and the
17 independent variables. Many of the correlations are
quite high; the largest is 0.85 between C2 and S11(RE).
Variable S11 also has large correlations with the other
four dependent variables, and the magnitudes of these
correlations decrease monotonically as we go from C2 to C5.

Another independent variable, S16(STEPS), also has
high correlations with the dependent variables, and these
correlations also decrease monotonically from C2 to C5.
S11 and S16 are both relatively simple measures of the
structural complexity of the standard proof for a problem.
S16 is the number of steps in the standard proof, while S11
is the number of occurrences of the rule RE in the
standard proof. The correlation between these two
variables is 0.79.

S15(THERM), which is also a standard proof variable,
displays the opposite pattern; its correlation with the
first dependent variable is relatively small (0.33) but
increases rapidly from C2 to C5. The correlation of S15
with C5 is 0.68, and is larger than that for any of the
other independent variables.

In Figure 5.1, the correlations of S11(RE),
S16(STEPS), and S15(THERM) with the five dependent
variables are plotted against the ordinal number of the
dependent variable (or equivalently, against the ordinal
number of the partitions that define the dependent
variables). The correlations of S11 and S16 decrease most
rapidly as the definition of the dependent variable changes
from the third to the fifth partition, while the
correlation of S15 with the dependent variable increases
most rapidly from the third to the fifth partition. It
should be noted that the fourth and fifth partitions are
the only partitions that do not depend at all on the
order of the steps in a proof; they depend only on the
rules that are used in the proof.

Variables S13(AV RE), S17(R INF), S18(AV TH),
S19(AV AX), S20(TOT R), and S22(POSIT) also have
substantial correlations with the dependent variables. The
pairwise correlations between these variables are generally
high, and all are highly correlated (> 0.85) with S22. In
the discussion that follows, these variables will be
referred to as the 'rule-position' variables.

The value of the position variable, S22, for a given
problem is the ordinal position of the problem within the
total set being examined. All of the rule-position
variables are confounded with the position variable; hence
any contribution that they make to the variance accounted
for by the regression equation may be due to the ordering
of the problems within the curriculum.

The correlations for the remaining variables, called
'problem-structure variables', are relatively small, and
they will not be discussed in any detail. These variables
show the same trend as S11 and S16, but the correlations
are much smaller and the pattern is less regular.

5.2 - ANALYSIS BASED ON THE FIRST PARTITION

The first dependent variable to be considered is C1,
the number of classes found in the sample when the first
partition is used to define the dependent variable. The
output from the stepwise regression program (BMD02R) is
presented for the first four steps of the analysis in
Tables 5.3A,B,C,D.
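For readers unfamiliar with stepwise regression, the general idea of forward selection can be sketched as follows: at each step, the variable with the largest partial correlation with the dependent variable (given the variables already entered) is added to the equation. This is a simplified illustration and not the BMD02R algorithm itself, which also applies F-to-enter and F-to-remove criteria; the data and names below are made up for the example.

```python
import math


def _center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _residualize(v, z):
    """Remove from v its least-squares projection on z (both centered)."""
    zz = _dot(z, z)
    if zz == 0:
        return v[:]
    k = _dot(v, z) / zz
    return [x - k * y for x, y in zip(v, z)]


def forward_steps(y, predictors, n_steps):
    """Return (name, partial r at entry, cumulative R^2) for each step."""
    y_res = _center(y)
    total_ss = _dot(y_res, y_res)
    remaining = {name: _center(col) for name, col in predictors.items()}
    steps = []
    for _ in range(min(n_steps, len(remaining))):
        yy = _dot(y_res, y_res)
        scores = {}
        for name, x in remaining.items():
            denom = math.sqrt(_dot(x, x) * yy)
            scores[name] = _dot(x, y_res) / denom if denom > 0 else 0.0
        best = max(scores, key=lambda name: abs(scores[name]))
        z = remaining.pop(best)
        # Partial out the entering variable from y and from the candidates.
        y_res = _residualize(y_res, z)
        remaining = {n: _residualize(c, z) for n, c in remaining.items()}
        steps.append((best, scores[best], 1 - _dot(y_res, y_res) / total_ss))
    return steps


# Made-up data: y depends on x1 and x2 only.
x1 = [0, 1, 2, 3, 4, 5, 6, 7]
x2 = [1, 0, 1, 0, 1, 0, 1, 0]
x3 = [0, 0, 1, 1, 0, 0, 1, 1]
y = [2 * a + 3 * b for a, b in zip(x1, x2)]
steps = forward_steps(y, {"x1": x1, "x2": x2, "x3": x3}, n_steps=2)
for name, r, r2 in steps:
    print(name, round(r, 3), round(r2, 3))
```

In this toy example x1 enters first and x2 second, after which the equation accounts for essentially all of the variance in y.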

In Table 5.3A, we see that S11(RE) is the first
variable to enter the equation. S11 accounts for 55
percent of the variance in C1. The table of partial
correlations that results after S11 has been partialed out
is worth examining carefully.

With S11(RE) partialed out, the correlation of
S16(STEPS) with C1 is only 0.31, having dropped from 0.72;
S11 accounts for most of the variance that could otherwise
be accounted for by S16. The correlation of S15(THERM)
with the dependent variable increases from 0.33 to 0.40.
This increase is partially explained by the low correlation
of S15 with S11 (0.06); S11 accounts for very little of the
variance that S15 is capable of predicting, while
eliminating much of the variance not accounted for by S15.
S18(AV TH) also shows a slight increase, but the
correlations of the other rule-position variables with C1
all decrease; S11 has a correlation of 0.42 with
S22(POSIT), and is taking out some of the 'rule-position'
variance. The correlations of the problem-structure
variables with the dependent variable increase slightly but
remain relatively small.
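The drop in the correlation of S16(STEPS) with C1 can be reproduced from the standard formula for a first-order partial correlation. The sketch below uses the rounded correlations quoted in this chapter (0.72 for C1 with S16, 0.75 for C1 with S11, and 0.79 for S11 with S16); the function name is illustrative.

```python
import math


def partial_corr(r_xy: float, r_xz: float, r_yz: float) -> float:
    """Correlation of x and y with z partialed out of both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))


# x = C1, y = S16(STEPS), z = S11(RE)
r = partial_corr(r_xy=0.72, r_xz=0.75, r_yz=0.79)
print(round(r, 2))  # prints 0.31
```

Because the inputs are rounded to two decimals, other partial correlations computed this way may differ slightly from the values printed by the regression program.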

The second variable to enter the equation is
S22(POSIT), and the output for this step is found in Table
5.3B. The coefficient for S22 is positive, and the
coefficient for S11(RE) decreases slightly when S22 enters
the equation. The small magnitude of the coefficient for
S22 is due to the fact that the position variable has a
very wide range compared to the dependent variable.

S22(POSIT) is strongly correlated with the measures of
the complexity of the set of rules available for any of the
proof problems. It is not clear how much of the importance
of this variable is due to the availability of rules and
how much is due to the fact that curriculum writers tend to
introduce problems of increasing complexity as the
curriculum progresses (the position effect).

After the variance accounted for by S22(POSIT) has
been partialed out, the correlations between C1 and all of
the rule-position variables drop sharply. The correlations
of S15(THERM) and S14(AXIOM) with the dependent variable
decrease slightly and the correlation of S16(STEPS) with C1
increases.

S16(STEPS) is the third variable to enter the equation
(Table 5.3C). The addition of S16 to the regression
equation causes the coefficient of S11(RE) to drop to about
one-third of its value at the previous step. S16 is now
accounting for a substantial part of the variance that had
previously been accounted for by S11.

At this point, the variance accounted for by S11(RE),
S16(STEPS), and most of the variance accounted for by the
rule-position variables has been partialed out. The
largest partial correlations are now found for the
variables S6 to S10, which measure the complexity of the
problem statement. The next independent variable to enter
the equation is S7(SYMBL).

Rather than continue this step-by-step examination of
the results of regression analysis, the nature of the
relationship between the dependent variable and the first
three independent variables to enter the equation will be
examined more closely. The summary table for the analysis
is found in Table 5.4.

A scatterplot of C1 against S11(RE) is presented in
Figure 5.2A. The relationship seems linear, but the
variance of C1 for any value of S11 is large, and there is
some indication that the variance is not independent of
S11. The plot of residuals (calculated after all the
variables have entered the equation) against S11 (Figure
5.2B) confirms these observations.

Examination of the plot of C1 against S22(POSIT) in
Figure 5.3A indicates a very different situation. For
values of S22 less than 50, both the mean and variance of
the distribution of C1 given S22 have relatively low values
and seem to be independent of S22. For values of S22 above
55, the mean and variance of the conditional distribution
of C1, given S22, again appear to be independent of S22,
but both have much higher values than they did for the
problems with S22 less than 50. The plot of the
residuals (computed after all of the variables have entered
the equation) against S22 in Figure 5.3B does not contain
any evidence for a departure from linearity but does show
clearly the abrupt change in variance.

A possible explanation of this phenomenon becomes
apparent when the curriculum is examined. Between the
problem with S22(POSIT) equal to 53 and the problem with
S22 equal to 54, Replace Equals (RE) is introduced. RE
permits the student to substitute for any expression (A) in
a formula any other expression (B) that has been shown to
be equal to expression (A). After the formula A=B has
been proved, A can be replaced by B in any formula within
the student's partial proof. This rule greatly increases
the student's flexibility in the order in which he
uses the available rules to construct a proof; the
partition that defines C1 is sensitive to these differences
in order (see Chapter II for a more detailed discussion of
RE).

Figures 5.4A,B contain the corresponding plots for
S16(STEPS). Again there is evidence for a basically linear
relationship and nonhomogeneity of variance. The 
indication in Figure 5.4A of a possible departure from 
linearity is not confirmed by Figure 5.4B. This impression 
of non-linearity is due to the six points in the upper 
right corner of Figure 5.4A. All six of these problems are 
long but straightforward; they do not use any of the more 
difficult rules, and they do not involve the recognition of 
any complicated sequence of the simpler rules; in spite of 
their length, these problems are unusually simple. 

Figure 5.5 is a frequency histogram for the residuals.
There is no evidence in this figure of any serious
departure from normality. Figure 5.6 is a scatterplot of
the residuals (after all of the independent variables have
entered the equation) against the predicted value of C1; in
this figure, there is clear indication that the assumption
of homogeneity of variance has been violated. The variance
seems to be an increasing function of the predicted value
of C1. Attempts to eliminate this nonhomogeneity by
transforming the observed values of C1 failed; a
logarithmic transformation and a square-root transformation
were both used without success.

Among the plots of residuals against the independent
variables, the strongest indication of this nonhomogeneity
of variance is found for S22 (see Figure 5.3B). The
variance in the residuals is not a smoothly varying
function, as Figure 5.6 might suggest; instead there is an
abrupt increase in variance just after the introduction of
the rule Replace Equals. This discontinuity seems to be a
property of the curriculum and not a function of the scale
chosen for the dependent variable. It is unlikely that any
continuous transformation (change of scale) for the
dependent variable would eliminate the nonhomogeneity of
variance.

If the analysis is restricted to the problems that
occur after the introduction of RE, the nonhomogeneity of
variance is eliminated. Analyses using this restricted set
of problems are reported in Chapter VI.


A full interpretation of these results must await the
discussion of the analyses for the other four partitions,
but some preliminary observations are appropriate here.

The first six variables to enter the regression
equation account for 80 percent of the total variance in
the dependent variable, and the first three variables
account for over 74 percent of the variance. The simple
linear model that has been assumed fits the data very well.

S11(RE) is the first variable to enter the equation,
and accounts for 55 percent of the total variance in the
dependent variable. The initial correlation (0.72) of
S16(STEPS) with the dependent variable is almost as high as
that (0.75) for S11(RE), and the correlation between these
two variables is 0.79. It seems that these two variables
are measuring similar properties of the problems. Both can
be interpreted as relatively simple measures of the
complexity of the standard proof for a problem.

Together, S11(RE) and S16(STEPS) account for almost 73
percent of the variance in the dependent variable. Since
the first partition is sensitive to minor variations in the
proofs, including changes in the order of the steps, it is
not surprising that simple measures of complexity account
for most of the variance in C1 (number of classes under the
first partition). When the dependent variable is defined
in terms of the fourth and fifth partitions, which are not
sensitive to minor variations in the proofs, the predictive
power of S11(RE) and S16(STEPS) is greatly diminished.

5.3 - ANALYSIS BASED ON THE SECOND PARTITION

The pattern of results for the second partition (with
the number of classes of proofs under the second
partition as the dependent variable) parallels the first.
The initial correlations are roughly the same. The first
variable to enter is S11(RE). The pattern of partial
correlations that appears after S11 has been included in
the equation (see Table 5.5A) is very similar to that for
the first partition (see Table 5.3A). There is one notable
exception to this generalization. The correlation of
S22(POSIT) with the dependent variable drops more sharply
when S11 is partialed out than it did for the first
partition. As a result, S15(THERM) has the highest partial
correlation in Table 5.5A. The apparent importance of S15
is especially notable, because the first theorem is
introduced only after eighty percent of the problems
included in this study have been completed, and S15 has a
very small range with only three possible values: 0, 1, and
2. The inclusion of S15 at the second step is due in part
to the fact that its correlation with S11 is only 0.08.

The partial correlations after the introduction of S15
are shown in Table 5.5B. The pattern that appears is very
similar to the pattern found after the introduction of S22
in the previous analysis. Since the correlation between
S15(THERM) and S22(POSIT) is 0.54, it is not surprising
that they have a similar effect on the partial correlations
in the two analyses. The third variable to enter (Table
5.5C) is S16(STEPS) and the fourth is S12(CP). The summary
table for this analysis is found in Table 5.6.

Figure 5.7 is a plot of the residuals against the
predicted value of C2, and Figure 5.8 is a plot of the
observed values of C2 against S22(POSIT). The evidence for
nonhomogeneity of variance is even more pronounced than it
was in the previous analysis; the explanation is the same
as it was there.

The only difference between the first partition and
the second partition is that the unused steps in proofs are
not relevant under the second partition. Since the
correlation between C1 (first partition) and C2 (second
partition) is 0.94, it is not surprising that the results
for this analysis are very similar to the results for the
first partition.

The substitution of S15(THERM) for S22(POSIT) is worth
noting. The importance of S15 as a predictor of the
dependent variable increases consistently as the dependent
variable changes from the second to the fifth partition.

5.4 - ANALYSIS BASED ON THE THIRD PARTITION 

Since the analysis for the third partition is very
similar to the two analyses already examined, the results
are only sketched. The correlation of S15(THERM) with the
dependent variable increases from 0.30 to 0.36 in changing
from the second to the third dependent variable. The
initial correlations for the rule-position variables are
larger than they were for the second partition. The
initial correlations of S11(RE) and S16(STEPS) with C3 are
smaller than they were for the second partition, but they
are still quite large.

S11 enters the equation first and has roughly the same
effect on the partial correlations as it did for the second
partition. S15(THERM) and S16(STEPS) are the second and
third variables to enter the equation. The fourth variable
included is S14(AXIOM). The first problem-structure
variable is not introduced until step five, and contributes
only two percent to the total variance accounted for by the
regression equation. For reference, the results of this
analysis are included in Tables 5.7A,B,C and 5.8.

The problem of the nonhomogeneity of variance is still 
present and will not be discussed here. The interpretation 
of this analysis will be postponed until the end of this 
chapter where the overall pattern of the results will be 
discussed. 

5.5 - ANALYSIS BASED ON THE FOURTH PARTITION

For two proofs to be equivalent under the fourth 
partition, they must have identical frequency distributions 
over the set of available rules. In the regression 
analysis with C4 as the dependent variable the general 
pattern of the results changes. S20(TOT R) is the first 
independent variable to enter the equation. The value of 
S20 for a problem is just the total number of rules, 
including axioms and theorems, that are available when the 
problem appears in the curriculum. After the second 
partition, the size of the initial correlation of the 
dependent variable with S11(RE) and S16(STEPS) declines, 
while the correlations of the dependent variable with 
S15(THERM), S14(AXIOM), and the rule-position variables 
increase. The rate of increase is highest for S15(THERM), 
but several of the rule-position variables had much larger 
correlations with C1 and C2 than S15, and some of these 
still have larger correlations with the dependent variable, 
C4. 
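
The fourth-partition criterion can be stated operationally: two proofs fall in the same class when each rule is used the same number of times, whatever the order. The following is a minimal sketch under that reading. The proofs, the use of rule labels as plain strings, and the order-sensitive comparison key are all hypothetical and are included only to show the contrast.

```python
from collections import Counter

def sequence_key(proof):
    # Order-sensitive criterion (an assumed stand-in for a stricter partition).
    return tuple(proof)

def frequency_key(proof):
    # Fourth-partition criterion: identical frequency distribution over rules.
    return frozenset(Counter(proof).items())

def count_classes(proofs, key):
    # Number of distinct equivalence classes among the sampled proofs.
    return len({key(proof) for proof in proofs})

# Hypothetical student proofs, each a list of rule labels in order of use.
proofs = [
    ["RE", "CP", "RE"],
    ["RE", "RE", "CP"],   # same rules as the first proof, different order
    ["RE", "CP", "CP"],   # different frequency distribution
]

strict_classes = count_classes(proofs, sequence_key)
fourth_classes = count_classes(proofs, frequency_key)
print(strict_classes, fourth_classes)
```

In this made-up sample the order-sensitive key separates all three proofs, while the frequency key merges the first two, so the class count cannot be larger under the frequency criterion.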

After S20(TOT R) has entered the regression equation, 
the partial correlations for all of the other rule-position 
variables drop sharply (see Table 5.9A). The partial 
correlation for S15 also decreases, but the decrease for 
this variable is smaller than that for the other 
rule-position variables. 

The second variable included in the equation is 
S11(RE). With the introduction of S11, the partial 
correlation of S16(STEPS) drops dramatically, and the 
partial correlation of S15 increases by over 50 percent 
(see Table 5.9B). 

The third variable to enter is S15 (Table 5.9C), and 
S14(AXIOM) is the fourth. A summary table for this 
analysis is found in Table 5.10. The homogeneity of 
variance assumption is again violated. 

The fourth partition is the first of the five 
partitions for which the order of the steps in a proof is 
irrelevant, and it is the first partition for which S11(RE) 
is not the first variable to enter the regression equation. 
In changing from the third partition to the fourth, the 
correlation of S11 with the dependent variable drops from 
0.78 to 0.59. Although S11 still has a prominent position 
in the analysis, it does not dominate the results as it did 
in the previous analyses. 

5.6 - ANALYSIS BASED ON THE FIFTH PARTITION 

In the analysis for the fifth partition, S15(THERM) 
enters first. Its initial correlation with the dependent 
variable is not much larger than the correlation for some 
of the rule-position variables (S20, S22). The 
correlations of S11 and S16 with the dependent variable are 
almost as large as that for S15. 

After S15 has been taken into account, the partial 
correlations of S11 and S16 increase. The partial 
correlations of the other rule-position variables decrease 
but remain relatively large. 

The second and third variables added are S11 and 
S14(AXIOM). The results for this analysis are found in 
Tables 5.11A,B,C and Table 5.12. 

Figure 5.9 contains the plot of C5 against S22(POSIT). 
The variance in C5 for the first 50 problems is practically 
zero; there are only four problems in this group with more 
than one class in the sample of student proofs. After the 
point in the curriculum where RE is introduced, there is 
evidence for a systematic dependence of variance on problem 
position. Figure 5.10 contains a plot of residuals against 
the predicted value of C5; there is again a strong 
indication of nonhomogeneity of variance. It would seem 
here that the nonhomogeneity has two components: the 
complete lack of variance for the problems with values of 
S22(POSIT) less than 50, and a gradual increase in variance 
with increasing values of S22 for the remaining problems. 
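
One simple way to look for the two components described above is to compare the variance of the dependent variable within successive ranges of problem position. The sketch below does this on invented numbers; the positions, values, and bin edges are hypothetical and are not taken from the study's data.

```python
from statistics import pvariance

def variance_by_bin(positions, values, edges):
    """Population variance of `values` within consecutive position bins."""
    bins = [[] for _ in range(len(edges) - 1)]
    for pos, val in zip(positions, values):
        for i in range(len(edges) - 1):
            if edges[i] <= pos < edges[i + 1]:
                bins[i].append(val)
                break
    # A bin with fewer than two points is reported as zero variance.
    return [pvariance(b) if len(b) > 1 else 0.0 for b in bins]

# Hypothetical data: curriculum position and number of classes per problem.
positions = [5, 20, 40, 60, 70, 80, 100, 110, 120]
values = [1, 1, 1, 2, 4, 3, 1, 6, 9]
edges = [0, 50, 90, 126]   # assumed cut points; the first bin is S22 < 50

variances = variance_by_bin(positions, values, edges)
print(variances)
```

With these invented values the first bin has no variance at all and the later bins show progressively more, which is the pattern of nonhomogeneity the text describes for C5.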

The initial correlation matrix (Table 5.2) and the 
analyses display a clear pattern. As we proceed from C1 to 
C5, the importance of S11(RE) and S16(STEPS) diminishes and 
the importance of S15(THERM), S14(AXIOM), and the 
rule-position variables increases. The remainder of the 
discussion in this chapter will investigate these trends. 

Figures 5.11A,B,C,D,E contain respectively the plots 
of the dependent variables, C1 to C5, against S15(THERM). 
In Figure 5.11A, there is relatively little indication of 
any functional relationship between C1 and S15. The 
impression that there is a relationship between the two 
variables grows from one partition to the next. 

Only three values (0, 1, and 2) for S15 appear in the 
data. There are 107 problems with S15 equal to zero, 11 
problems with S15 equal to one, and 7 problems with S15 
equal to two. In Figure 5.11A, the range of the 
conditional distribution of C1 given that S15 is equal to 
zero is 22 (from 1 to 23), covering the entire possible 
range for the dependent variables. The ranges for the 
conditional distributions of C1 given that S15 equals one 
or two are also large, but not as large as the range for 
the problems that do not use theorems. 

One implication of the nested character of the 
partitions is that the number of classes for any problem is 
a non-increasing function of the ordinal number of the 
partition. The value of the dependent variable cannot 
increase from any partition to the next, and can decrease 
(unless it is already equal to one). This property of the 
sets of classification criteria is reflected in the data; 
the means of all three of the conditional distributions of 
the dependent variable decrease as the dependent variable 
changes from one partition to the next (note that in 
Figures 5.11A,B,C,D,E, the scale of the dependent variable 
changes). 

The relationship between the means of the three 
conditional distributions does not change much from one 
partition to the next. In all five plots, the mean of the 
conditional distribution of the dependent variable, given 
S15(THERM), increases as S15 increases. The relationship 
seems to be nonlinear with a positive, increasing slope, 
but the small number of problems with two theorems in their 
standard proof makes this hypothesis quite unreliable. A 
single additional problem with S15 equal to two, and with 
low values for the dependent variables, would eliminate 
this impression of nonlinearity. 

The most significant change that occurs from one 
partition to the next is the decrease in the variance of 
the dependent variable when S15 equals zero. By the fifth 
partition (C5), the ranges of the three conditional 
distributions are almost equal. For problems using 
theorems, the number of classes is less sensitive to the 
strictness of the definition of equivalence than for 
problems not using theorems. The large amount of variation 
that appears for some of the problems that do not use 
theorems in their standard proofs rapidly disappears for 
the progressively less strict sets of equivalence criteria. 

Figures 5.12A,B,C,D,E, containing the plots of the 
five dependent variables against S11(RE), indicate the 
nature of these problems. A strong linear relationship 
between C1 and S11 is evident in Figure 5.12A; in general, 
problems with high values for C1 also have high values for 
S11. In the progression to the least strict set of 
equivalence criteria (the fifth partition - C5), the value 
of the dependent variable decreases for all of the 
problems, but the decrease is greater for the problems with 
high values for S11. In Figure 5.12B, with C2 as the 
dependent variable, a strong linear relationship is still 
apparent, but in Figure 5.12C this relationship has become 
obscure. By the fifth partition, Figure 5.12E, the 
existence of any linear relationship is not obvious. 

An examination of the sample of proofs constructed for 
the problems in the curriculum tends to confirm the 
conclusions implicit in these results (specific examples 
will be discussed in Chapter VII). Problems that require a 
large number of steps (high values for S16) and involve 
extensive use of RE (high values for S11) tend to have a 
substantial number of superficial differences in the proofs 
generated. Variation in the order in which the rules are 
used is very common for these problems. The first two 
partitions (C1 and C2), and to a lesser extent the third 
partition, are sensitive to this type of variation, while 
the fourth and fifth partitions are not. 

Problems that require the use of theorems (S15) tend 
to produce more basic variations in the proofs generated. 
The theorems chosen and the rules used in conjunction with 
the theorems differ from one student to another. All five 
sets of classification criteria are sensitive to this type 
of variation. 

TABLE 5.1 

MEANS AND STANDARD DEVIATIONS FOR FULL SET 

VARIABLE           MEAN     STANDARD DEVIATION 

CLAS1   1       8.80800          6.76053 
CLAS2   2       5.84800          5.99201 
CLAS3   3       5.00000          5.06092 
CLAS4   4       3.36000          3.39449 
CLAS5   5       2.89600          2.86757 
WORDS   6      14.43200          7.30260 
SYMBL   7      12.20000          6.72501 
LOGCN   8       0.26400          0.46043 
PAREN   9       0.84000          0.82696 
PREMS  10       0.52000          0.84815 
RE     11       0.74400          1.09915 
CP     12       0.23200          0.42381 
AV RE  13       0.58400          0.49488 
AXIOM  14       0.27200          0.55903 
THERM  15       0.20000          0.52363 
STEPS  16       3.76800          2.56256 
R INF  17      16.93600          2.15056 
AV TH  18       0.68000          1.50054 
AV AX  19       1.80000          2.18130 
TOT R  20      19.41600          5.11352 
PSLI   21       6.74400          5.20995 
POSIT  22      64.24000         37.09847 

TABLE 5.2 

CORRELATION MATRIX FOR FULL SET 

VARIABLE 
NUMBER       1       2       3       4       5 

     1   1.000 
     2   0.940   1.000 
     3   0.919   0.973   1.000 
     4   0.806   0.820   0.882   1.000 
     5   0.709   0.694   0.774   0.947   1.000 

MATRIX CONTINUED 

VARIABLE 
NUMBER       1       2       3       4       5       6       7       8       9      10 

     6   0.256   0.157   0.189   0.122   0.090   1.000 
     7   0.189   0.097   0.131   0.030  -0.004   0.800   1.000 
     8  -0.066  -0.058  -0.093  -0.159  -0.211   0.115   0.215   1.000 
     9   0.266   0.190   0.214   0.118   0.105   0.591   0.776  -0.164   1.000 
    10  -0.171  -0.091  -0.128  -0.180  -0.230  -0.344  -0.325   0.059  -0.398   1.000 

MATRIX CONTINUED 

VARIABLE 
NUMBER       1       2       3       4       5       6       7       8       9      10      11      12      13      14      15 

    11   0.745   0.825   0.776   0.593   0.447   0.055  -0.034   0.007   0.043   0.118   1.000 
    12  -0.060  -0.069  -0.094  -0.137  -0.186   0.134   0.230   0.923  -0.123   0.021   0.025   1.000 
    13   0.648   0.577   0.560   0.546   0.515  -0.026  -0.118  -0.116   0.013  -0.115   0.574  -0.074   1.000 
    14   0.385   0.299   0.305   0.335   0.325   0.030  -0.032  -0.187   0.060  -0.233   0.219  -0.166   0.412   1.000 
    15   0.326   0.305   0.362   0.558   0.680  -0.164  -0.147  -0.221   0.056  -0.200   0.076  -0.211   0.324  -0.022   1.000 

TABLE 5.2 CONTINUED 

MATRIX CONTINUED 

VARIABLE 
NUMBER      16      17      18      19      20 

     1   (row illegible in source) 
     2   (row illegible in source) 
     3   0.731   0.537   0.295   0.468   0.512 
     4   0.495   0.560   0.487   0.579   0.625 
     5   0.333   0.547   0.603   0.630   0.676 
     6   0.199   0.093  -0.080   0.017   0.023 
     7   0.151  -0.003  -0.245  -0.170  -0.145 
     8   0.285  -0.178  -0.227  -0.308  -0.273 
     9   0.146   0.126  -0.120   0.045   0.037 
    10   0.004  -0.357  -0.223  -0.366  -0.372 
    11   0.789   0.447   0.131   0.264   0.339 
    12   0.258  -0.187  -0.212  -0.290  -0.264 
    13   0.343   0.861   0.384   0.699   0.773 
    14   0.168   0.444   0.153   0.534   0.459 
    15  -0.013   0.370   0.760   0.565   0.619 
    16   1.000   0.252   0.016   0.136   0.169 
    17           1.000   0.438   0.778   0.881 
    18                   1.000   0.670   0.764 
    19                           1.000   0.950 
    20                                   1.000 

MATRIX CONTINUED 

VARIABLE 
NUMBER      21      22 

     1  -0.045   0.616 
     2  -0.041   0.525 
     3  -0.060   0.548 
     4  -0.152   0.621 
     5  -0.199   0.648 
     6   0.196   0.051 
     7   0.263  -0.094 
     8   0.361  -0.203 
     9   0.041   0.055 
    10   0.107  -0.338 
    11   0.035   0.417 
    12   0.312  -0.197 
    13  -0.170   0.852 
    14  -0.197   0.451 
    15  -0.291   0.542 
    16   0.116   0.232 
    17  -0.096   0.942 
    18  -0.300   0.666 
    19  -0.385   0.896 
    20  -0.293   0.974 
    21   1.000  -0.126 
    22           1.000 

FIGURE 5.1 CORRELATIONS BETWEEN DEPENDENT AND INDEPENDENT VARIABLES AGAINST THE ORDINAL NUMBER OF THE DEPENDENT VARIABLE 

TABLE 5.3A 

STEP NUMBER 1 FOR C1 

VARIABLE ENTERED      11 
MULTIPLE R            0.7454 
STD. ERROR OF EST.    4.5247 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     1         3149.172      3149.172    153.818 
RESIDUAL     123         2518.220        20.473 

VARIABLES IN EQUATION: (CONSTANT = 5.39683) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         4.58491     0.36968     153.8182 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS2   2         0.86164      0.3186     351.6484 (1) 
CLAS3   3         0.81121      0.3984     234.7856 (1) 
CLAS4   4         0.67721      0.6479     103.3474 (1) 
CLAS5   5         0.63109      0.8003      80.7507 (1) 
WORDS   6         0.32348      0.9970      14.2575 (2) 
SYMBL   7         0.32289      0.9988      14.1999 (2) 
LOGCN   8        -0.10773      0.9999       1.4326 (2) 
PAREN   9         0.35043      0.9981      17.0792 (2) 
PREMS  10        -0.39108      0.9861      22.0288 (2) 
CP     12        -0.11810      0.9994       1.7257 (2) 
AV RE  13         0.40453      0.6710      23.8705 (2) 
AXIOM  14         0.34103      0.9519      16.0561 (2) 
THERM  15         0.40457      0.9943      23.8768 (2) 
STEPS  16         0.31366      0.3774      13.3124 (2) 
R INF  17         0.46326      0.8004      33.3360 (2) 
AV TH  18         0.32916      0.9829      14.8243 (2) 
AV AX  19         0.48899      0.9301      38.3384 (2) 
TOT R  20         0.50090      0.8850      40.8629 (2) 
PSLI   21        -0.10685      0.9988       1.4090 (2) 
POSIT  22         0.50386      0.8261      41.5112 (2) 
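
The summary quantities at the head of each step table follow from the two sums of squares in its analysis of variance. As a check on the reading of Table 5.3A, the sketch below recomputes them using the standard regression identities; the variable names are illustrative, and the number of problems (125) is taken from the counts given in the text.

```python
import math

n = 125              # problems in the full set (107 + 11 + 7)
k = 1                # independent variables in the equation at step 1
ss_reg = 3149.172    # regression sum of squares, Table 5.3A
ss_res = 2518.220    # residual sum of squares, Table 5.3A

df_res = n - k - 1                        # residual degrees of freedom
r_squared = ss_reg / (ss_reg + ss_res)    # proportion of variance explained
multiple_r = math.sqrt(r_squared)
ms_res = ss_res / df_res                  # residual mean square
f_ratio = (ss_reg / k) / ms_res
std_error = math.sqrt(ms_res)             # standard error of estimate

print(round(multiple_r, 4), round(f_ratio, 3), round(std_error, 4))
```

The recomputed values agree with the printed MULTIPLE R, F-RATIO, and STD. ERROR OF EST. to within the rounding of the tabled sums of squares.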

TABLE 5.3B 

STEP NUMBER 2 FOR C1 

VARIABLE ENTERED      22 
MULTIPLE R            0.8176 
STD. ERROR OF EST.    3.9244 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     2         3788.482      1894.241    122.995 
RESIDUAL     122         1878.910        15.401 

VARIABLES IN EQUATION: (CONSTANT = 1.77608) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         3.63703     0.35277     106.2927 (2) 
POSIT  22         0.06734     0.01045      41.5112 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS2   2         0.84628      0.2791     305.3467 (1) 
CLAS3   3         0.77260      0.3376     179.1791 (1) 
CLAS4   4         0.56538      0.4789      56.8495 (1) 
CLAS5   5         0.48513      0.5420      37.2435 (1) 
WORDS   6         0.35670      0.9960      17.6399 (2) 
SYMBL   7         0.42634      0.9912      26.8789 (2) 
LOGCN   8         0.00735      0.9488       0.0065 (2) 
PAREN   9         0.38235      0.9965      20.7176 (2) 
PREMS  10        -0.22391      0.8041       6.3863 (2) 
CP     12        -0.00345      0.9472       0.0014 (2) 
AV RE  13        -0.02069      0.2165       0.0518 (2) 
AXIOM  14         0.17291      0.7951       3.7292 (2) 
THERM  15         0.16912      0.6787       3.5626 (2) 
STEPS  16         0.47127      0.3661      34.5456 (2) 
R INF  17        -0.01520      0.1095       0.0279 (2) 
AV TH  18        -0.01999      0.5305       0.0484 (2) 
AV AX  19         0.09758      0.1829       1.1632 (2) 
TOT R  20         0.05289      0.0466       0.3394 (2) 
PSLI   21        -0.03409      0.9750       0.1408 (2) 

TABLE 5.3C 

STEP NUMBER 3 FOR C1 

VARIABLE ENTERED      16 
MULTIPLE R            0.8615 
STD. ERROR OF EST.    3.4756 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     3         4205.775      1401.925    116.058 
RESIDUAL     121         1461.617        12.079 

VARIABLES IN EQUATION: (CONSTANT = -1.57704) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         1.32601     0.50221       6.9716 (2) 
STEPS  16         1.18310     0.20129      34.5456 (2) 
POSIT  22         0.07691     0.00940      66.9611 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS2   2         0.79913      0.2082     212.0490 (1) 
CLAS3   3         0.71970      0.2802     128.9437 (1) 
CLAS4   4         0.55845      0.4651      54.3861 (1) 
CLAS5   5         0.51081      0.5389      42.3653 (1) 
WORDS   6         0.27328      0.9269       9.6852 (2) 
SYMBL   7         0.34745      0.9132      16.4753 (2) 
LOGCN   8        -0.24721      0.7712       7.8108 (2) 
PAREN   9         0.33706      0.9596      15.3808 (2) 
PREMS  10        -0.12535      0.7547       1.9155 (2) 
CP     12        -0.21275      0.8219       5.6890 (2) 
AV RE  13         0.04882      0.2126       0.2867 (2) 
AXIOM  14         0.15938      0.7913       3.1275 (2) 
THERM  15         0.20591      0.6782       5.3130 (2) 
R INF  17         0.01525      0.1091       0.0279 (2) 
AV TH  18        -0.00390      0.5298       0.0018 (2) 
AV AX  19         0.07102      0.1819       0.6083 (2) 
TOT R  20         0.06583      0.0466       0.5223 (2) 
PSLI   21        -0.10357      0.9609       1.3011 (2) 

TABLE 5.3D 

STEP NUMBER 4 FOR C1 

VARIABLE ENTERED      7 
MULTIPLE R            0.8793 
STD. ERROR OF EST.    3.2726 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     4         4382.222      1095.556    102.295 
RESIDUAL     120         1285.170        10.710 

VARIABLES IN EQUATION: (CONSTANT = -3.40709) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
SYMBL   7         0.18562     0.04573      16.4753 (2) 
RE     11         1.75889     0.48475      13.1654 (2) 
STEPS  16         0.95828     0.19746      23.5509 (2) 
POSIT  22         0.07832     0.00886      78.2004 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS2   2         0.80379      0.2030     217.2378 (1) 
CLAS3   3         0.70265      0.2663     116.0447 (1) 
CLAS4   4         0.56703      0.4620      56.3933 (1) 
CLAS5   5         0.52462      0.5372      45.1883 (1) 
WORDS   6        -0.00504      0.3414       0.0030 (2) 
LOGCN   8        -0.29910      0.7646      11.6915 (2) 
PAREN   9         0.11373      0.3805       1.5593 (2) 
PREMS  10        -0.00155      0.6586       0.0003 (2) 
CP     12        -0.27704      0.8082       9.8928 (2) 
AV RE  13         0.07042      0.2121       0.5930 (2) 
AXIOM  14         0.17299      0.7912       3.6708 (2) 
THERM  15         0.26313      0.6695       8.8519 (2) 
R INF  17        -0.09389      0.1002       1.0583 (2) 
AV TH  18         0.09157      0.4966       1.0064 (2) 
AV AX  19         0.16595      0.1721       3.3700 (2) 
TOT R  20         0.17178      0.0435       3.6184 (2) 
PSLI   21        -0.20193      0.9092       5.0587 (2) 

TABLE 5.4 

SUMMARY TABLE FOR C1 ON THE FULL SET OF PROBLEMS 

STEP   VARIABLE         MULTIPLE              INCREASE     F VALUE     LAST REG 
NUM          ENT  REM          R       RSQ      IN RSQ     FOR DEL   COEFFICNTS 

  1    RE     11         0.74540   0.55562     0.55562    153.8182      1.18711 
  2    POSIT  22         0.81760   0.66847     0.11285     41.5112     -0.08912 
  3    STEPS  16         0.86150   0.74218     0.07371     34.5456      1.22277 
  4    SYMBL   7         0.87930   0.77317     0.03099     16.4753      0.24390 
  5    LOGCN   8         0.89080   0.79352     0.02036     11.6915     -1.30035 
  6    THERM  15         0.89760   0.80569     0.01216      7.4174      3.95143 
  7    AXIOM  14         0.90450   0.81812     0.01243      7.9252      1.95110 
  8    AV RE  13         0.90890   0.82610     0.00798      5.3051      4.61910 
  9    PAREN   9         0.91100   0.82992     0.00382      2.6769     -0.99653 
 10    WORDS   6         0.91180   0.83138     0.00146      0.9327      0.08359 
 11    CP     12         0.91260   0.83284     0.00146      0.9?91     -1.56661 
 12    PSLI   21         0.91300   0.83357     0.00073      0.5098      0.11952 
 13    R INF  17         0.91330   0.83412     0.00055      0.4201      0.33589 
 14    TOT R  20         0.91350   0.83448     0.00037      0.1295      0.40774 
 15    PREMS  10         0.91350   0.83448     0.00000      0.1035     -0.14266 
 16    AV TH  18         0.91360   0.83466     0.00018      0.0532     -0.11983 
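
The RSQ and INCREASE IN RSQ columns of the summary table can be recovered from the MULTIPLE R column alone. The short sketch below does so for the first four steps; it assumes, as the printed figures suggest, that RSQ was obtained by squaring the four-decimal value of R.

```python
# MULTIPLE R for the first four steps of Table 5.4.
multiple_r = [0.7454, 0.8176, 0.8615, 0.8793]

# RSQ is the square of R; the increase is the difference between steps.
rsq = [r ** 2 for r in multiple_r]
increase = [rsq[0]] + [later - earlier for earlier, later in zip(rsq, rsq[1:])]

for step, (r2, inc) in enumerate(zip(rsq, increase), start=1):
    print(step, round(r2, 5), round(inc, 5))
```

The increase at each step is the share of the total variance attributed to the variable entering at that step, which is how the text's statements about the percentage contribution of a variable should be read.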

FIGURE 5.2A C1 VS S11(RE) 
FIGURE 5.2B C1 RESIDUALS (Y-AXIS) VS S11 (X-AXIS) 
FIGURE 5.3A C1 VS S22(POSIT) 
FIGURE 5.3B C1 RESIDUALS (Y-AXIS) VS S22 (X-AXIS) 
FIGURE 5.4A C1 VS S16(STEPS) 
FIGURE 5.4B C1 RESIDUALS (Y-AXIS) VS S16 (X-AXIS) 
FIGURE 5.5 RESIDUALS FOR C1 ON FULL SET 
FIGURE 5.6 RESIDUALS (Y-AXIS) VS COMPUTED C1 (X-AXIS) 

TABLE 5.5A 

STEP NUMBER 1 FOR C2 

VARIABLE ENTERED      11 
MULTIPLE R            0.8255 
STD. ERROR OF EST.    3.3960 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     1         3033.612      3033.612    263.048 
RESIDUAL     123         1418.500        11.533 

VARIABLES IN EQUATION: (CONSTANT = 2.50000) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         4.50000     0.27746     263.0485 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.86164      0.4443     351.6484 (1) 
CLAS3   3         0.93263      0.3984     815.0219 (1) 
CLAS4   4         0.72735      0.6479     137.0459 (1) 
CLAS5   5         0.64321      0.8003      86.0897 (1) 
WORDS   6         0.19866      0.9970       5.0125 (2) 
SYMBL   7         0.22315      0.9988       6.3933 (2) 
LOGCN   8        -0.11393      0.9999       1.6044 (2) 
PAREN   9         0.27417      0.9981       9.9159 (2) 
PREMS  10        -0.33548      0.9861      15.4720 (2) 
CP     12        -0.15758      0.9994       3.1065 (2) 
AV RE  13         0.22351      0.6710       6.4153 (2) 
AXIOM  14         0.21421      0.9519       5.8670 (2) 
THERM  15         0.43154      0.9943      27.9194 (2) 
STEPS  16         0.40362      0.3774      23.7427 (2) 
R INF  17         0.30920      0.8004      12.8968 (2) 
AV TH  18         0.26045      0.9829       8.8783 (2) 
AV AX  19         0.38593      0.9301      21.3512 (2) 
TOT R  20         0.37298      0.8850      19.7140 (2) 
PSLI   21        -0.12433      0.9988       1.9155 (2) 
POSIT  22         0.35226      0.8261      17.2837 (2) 
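
The F TO ENTER and TOLERANCE columns are tied to the other printed figures by standard identities, which gives a way to check a doubtful reading. The sketch below assumes the usual definitions: F to enter computed from the partial correlation and the residual degrees of freedom after entry, and, with a single variable in the equation, tolerance equal to one minus the squared correlation between the candidate and that variable. The function names are illustrative.

```python
def f_to_enter(partial_r, df_residual_after_entry):
    """F statistic for adding a candidate with the given partial correlation."""
    r2 = partial_r ** 2
    return r2 * df_residual_after_entry / (1.0 - r2)

def tolerance_one_predictor(r_candidate_entered):
    """Tolerance of a candidate when exactly one variable is in the equation."""
    return 1.0 - r_candidate_entered ** 2

# THERM in Table 5.5A: partial correlation 0.43154; entering it as the
# second variable leaves 125 - 2 - 1 = 122 residual degrees of freedom.
f_therm = f_to_enter(0.43154, 122)

# STEPS in Table 5.5A: its correlation with RE in Table 5.2 is 0.789.
tol_steps = tolerance_one_predictor(0.789)

print(round(f_therm, 3), round(tol_steps, 4))
```

Both results are close to the printed entries (27.9194 for THERM and 0.3774 for STEPS). The low tolerance of STEPS reflects how much of its variation is already carried by RE.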

TABLE 5.5B 

STEP NUMBER 2 FOR C2 

VARIABLE ENTERED      15 
MULTIPLE R            0.8607 
STD. ERROR OF EST.    3.0760 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     2         3297.779      1648.889    174.269 
RESIDUAL     122         1154.333         9.462 

VARIABLES IN EQUATION: (CONSTANT = 2.01589) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         4.39924     0.25204     304.6665 (2) 
THERM  15         2.79542     0.52905      27.9194 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.83282      0.3716     273.8946 (1) 
CLAS3   3         0.91680      0.3058     637.7576 (1) 
CLAS4   4         0.65084      0.3831      88.9216 (1) 
CLAS5   5         0.53160      0.3804      47.6637 (1) 
WORDS   6         0.30545      0.9685      12.4510 (2) 
SYMBL   7         0.31983      0.9780      13.7872 (2) 
LOGCN   8        -0.02064      0.9507       0.0516 (2) 
PAREN   9         0.27906      0.9953      10.2185 (2) 
PREMS  10        -0.27730      0.9423      10.0796 (2) 
CP     12        -0.07435      0.9539       0.6726 (2) 
AV RE  13         0.08903      0.5920       0.9668 (2) 
AXIOM  14         0.25665      0.9504       8.5322 (2) 
STEPS  16         0.50798      0.3721      42.0828 (2) 
R INF  17         0.17564      0.6870       3.8516 (2) 
AV TH  18        -0.11355      0.4177       1.5806 (2) 
AV AX  19         0.19027      0.6315       4.5451 (2) 
TOT R  20         0.14303      0.5307       2.5270 (2) 
PSLI   21         0.00352      0.9118       0.0015 (2) 
POSIT  22         0.14642      0.5638       2.6509 (2) 

TABLE 5.5C 

STEP NUMBER 3 FOR C2 

VARIABLE ENTERED      16 
MULTIPLE R            0.8987 
STD. ERROR OF EST.    2.6605 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     3         3595.649      1198.550    169.330 
RESIDUAL     121          856.463         7.078 

VARIABLES IN EQUATION: (CONSTANT = -0.42479) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         2.56248     0.35733      51.4245 (2) 
THERM  15         3.15129     0.46086      46.7565 (2) 
STEPS  16         0.99152     0.15284      42.0828 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.79795      0.3126     210.3244 (1) 
CLAS3   3         0.90031      0.2524     513.4358 (1) 
CLAS4   4         0.66405      0.3717      94.6522 (1) 
CLAS5   5         0.57592      0.3784      59.5561 (1) 
WORDS   6         0.21999      0.9131       6.1030 (2) 
SYMBL   7         0.21586      0.9024       5.8648 (2) 
LOGCN   8        -0.31725      0.7650      13.4296 (2) 
PAREN   9         0.21581      0.9594       5.8619 (2) 
PREMS  10        -0.22143      0.9130       6.1869 (2) 
CP     12        -0.33090      0.8205      14.7554 (2) 
AV RE  13         0.21921      0.5707       6.0573 (2) 
AXIOM  14         0.30548      0.9503      12.3505 (2) 
R INF  17         0.29589      0.6715      11.5139 (2) 
AV TH  18        -0.08413      0.4150       0.8555 (2) 
AV AX  19         0.26132      0.6287       8.7949 (2) 
TOT R  20         0.24136      0.5224       7.4228 (2) 
PSLI   21        -0.06375      0.8999       0.4896 (2) 
POSIT  22         0.24837      0.5544       7.8890 (2) 

TABLE 5.6 

SUMMARY TABLE FOR C2 ON THE FULL SET OF PROBLEMS 

STEP   VARIABLE         MULTIPLE              INCREASE     F VALUE     LAST REG 
NUM          ENT  REM          R       RSQ      IN RSQ     FOR DEL   COEFFICNTS 

  1    RE     11         0.82550   0.68145     0.68145    263.0485      1.96598 
  2    THERM  15         0.86070   0.74080     0.05935     27.9194      4.09484 
  3    STEPS  16         0.89870   0.80766     0.06686     42.0828      1.18386 
  4    CP     12         0.91030   0.82865     0.02098     14.7554     -2.31862 
  5    R INF  17         0.91750   0.84181     0.01316      9.8915      0.13165 
  6    SYMBL   7         0.92270   0.85138     0.00957      7.5359      0.18167 
  7    AXIOM  14         0.92540   0.85637     0.00499      4.0936      0.91066 
  8    AV TH  18         0.92760   0.86044     0.00408      3.3821     -0.68172 
  9    PAREN   9         0.93020   0.86527     0.00483      4.1235     -1.20807 
 10    AV RE  13         0.93080   0.86639     0.00112      0.9606      2.02866 
 11    PREMS  10         0.93160   0.86788     0.00149      1.3115     -0.32617 
 12    R INF        17   0.93160   0.86788     0.00000      0.0000 
 13    WORDS   6         0.93180   0.86825     0.00037      0.3128      0.02218 
 14    LOGCN   8         0.93200   0.86862     0.00037      0.2403     -0.67460 
 15    TOT R  20         0.93200   0.86862     0.00000      0.0443      0.43478 
 16    PSLI   21         0.93200   0.86862     0.00000      0.0161      0.07729 
 17    POSIT  22         0.93210   0.86881     0.00019      0.2098     -0.07002 
 18    R INF  17         0.93220   0.86900     0.00019      0.0541      0.13185 

FIGURE 5.7 RESIDUALS (Y-AXIS) VS COMPUTED C2 (X-AXIS) 
FIGURE 5.8 C2 VS S22(POSIT) 

TABLE 5.7A 

STEP NUMBER 1 FOR C3 

VARIABLE ENTERED      11 
MULTIPLE R            0.7756 
STD. ERROR OF EST.    3.2074 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     1         1910.612      1910.612    185.718 
RESIDUAL     123         1265.388        10.288 

VARIABLES IN EQUATION: (CONSTANT = 2.34300) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         3.57124     0.26205     185.7180 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.81121      0.4443     234.7856 (1) 
CLAS2   2         0.93263      0.3186     815.0219 (1) 
CLAS4   4         0.83027      0.6479     270.7244 (1) 
CLAS5   5         0.75699      0.8003     163.7349 (1) 
WORDS   6         0.23239      0.9970       6.9648 (2) 
SYMBL   7         0.25048      0.9988       8.1666 (2) 
LOGCN   8        -0.15681      0.9999       3.0757 (2) 
PAREN   9         0.28592      0.9981      10.8615 (2) 
PREMS  10        -0.34984      0.9861      17.0132 (2) 
CP     12        -0.17927      0.9994       4.0508 (2) 
AV RE  13         0.22317      0.6710       6.3948 (2) 
AXIOM  14         0.21914      0.9519       6.1542 (2) 
THERM  15         0.48212      0.9943      36.9456 (2) 
STEPS  16         0.30604      0.3774      12.6072 (2) 
R INF  17         0.33766      0.8004      15.6997 (2) 
AV TH  18         0.30959      0.9829      12.9327 (2) 
AV AX  19         0.43118      0.9301      27.8611 (2) 
TOT R  20         0.41934      0.8850      26.0303 (2) 
PSLI   21        -0.13847      0.9988       2.3850 (2) 
POSIT  22         0.39085      0.8261      21.9981 (2) 

TABLE 5.7B 

STEP NUMBER 2 FOR C3 

VARIABLE ENTERED      15 
MULTIPLE R            0.8332 
STD. ERROR OF EST.    2.8215 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     2         2204.741      1102.371    138.469 
RESIDUAL     122          971.259         7.961 

VARIABLES IN EQUATION: (CONSTANT = 1.83217) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         3.46491     0.23119     224.6214 (2) 
THERM  15         2.94969     0.48528      36.9456 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.76904      0.3716     175.1466 (1) 
CLAS2   2         0.91680      0.2593     637.7576 (1) 
CLAS4   4         0.77492      0.3831     181.6814 (1) 
CLAS5   5         0.67509      0.3804     101.3209 (1) 
WORDS   6         0.36348      0.9685      18.4193 (2) 
SYMBL   7         0.36927      0.9780      19.1047 (2) 
LOGCN   8        -0.05831      0.9507       0.4128 (2) 
PAREN   9         0.29772      0.9953      11.7679 (2) 
PREMS  10        -0.28985      0.9423      11.0976 (2) 
CP     12        -0.08929      0.9539       0.9724 (2) 
AV RE  13         0.07017      0.5920       0.5988 (2) 
AXIOM  14         0.27219      0.9504       9.6822 (2) 
STEPS  16         0.41779      0.3721      25.5870 (2) 
R INF  17         0.19246      0.6870       4.6544 (2) 
AV TH  18        -0.09804      0.4177       1.1744 (2) 
AV AX  19         0.21889      0.6315       6.0894 (2) 
TOT R  20         0.16844      0.5307       3.5331 (2) 
PSLI   21         0.00457      0.9118       0.0025 (2) 
POSIT  22         0.16471      0.5638       3.3741 (2) 

TABLE 5.7C 

STEP NUMBER 3 FOR C3 

VARIABLE ENTERED      16 
MULTIPLE R            0.8646 
STD. ERROR OF EST.    2.5741 

ANALYSIS OF VARIANCE: 

              DF   SUM OF SQUARES   MEAN SQUARE    F-RATIO 
REGRESSION     3         2374.276       791.425    119.446 
RESIDUAL     121          801.724         6.626 

VARIABLES IN EQUATION: (CONSTANT = -0.00914) 

VARIABLE      COEFFICIENT  STD. ERROR  F TO REMOVE 
RE     11         2.07922     0.34573      36.1688 (2) 
THERM  15         3.21817     0.44589      52.0915 (2) 
STEPS  16         0.74803     0.14788      25.5870 (2) 

VARIABLES NOT IN EQUATION: 

VARIABLE    PARTIAL CORR.   TOLERANCE   F TO ENTER 
CLAS1   1         0.72310      0.3126     131.5064 (1) 
CLAS2   2         0.90031      0.1924     513.4358 (1) 
CLAS4   4         0.78554      0.3717     193.3721 (1) 
CLAS5   5         0.71158      0.3784     123.0865 (1) 
WORDS   6         0.29879      0.9131      11.7629 (2) 
SYMBL   7         0.29004      0.9024      11.0216 (2) 
LOGCN   8        -0.29810      0.7650      11.7034 (2) 
PAREN   9         0.24474      0.9594       7.6457 (2) 
PREMS  10        -0.24173      0.9130       7.4475 (2) 
CP     12        -0.29142      0.8205      11.1367 (2) 
AV RE  13         0.16750      0.5707       3.4641 (2) 
AXIOM  14         0.30546      0.9503      12.3493 (2) 
R INF  17         0.28416      0.6715      10.5409 (2) 
AV TH  18        -0.07074      0.4150       0.6035 (2) 
AV AX  19         0.27260      0.6287       9.6334 (2) 
TOT R  20         0.24455      0.5224       7.6331 (2) 
PSLI   21        -0.04786      0.8999       0.2755 (2) 
POSIT  22         0.24282      0.5544       7.5186 (2) 
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TABLE 5.8 



SUMMARY TABLE FOR C3 ON THE FULL SET OF PROBLEMS 



STEP  VARIABLE        MULTIPLE            INCREASE    F VALUE    LAST REG
 NUM         ENT REM         R       RSQ    IN RSQ    FOR DEL  COEFFICNTS

   1  RE      11       0.77560   0.60156   0.60156   185.7180     1.72670
   2  THERM   15       0.83320   0.69422   0.09267    36.9456     4.12214
   3  STEPS   16       0.86460   0.74753   0.05331    25.5870     0.??129
   4  AXIOM   14       0.87810   0.77106   0.02353    12.3493     0.82494
   5  SYMBL    7       0.89090   0.79370   0.02264    13.0033     0.22774
   6  CP      12       0.90120   0.81216   0.01346    11.6125    -1.17221
   7  PAREN    9       0.90480   0.81866   0.00650     4.2352    -1.41810
   8  AV TH   18       0.90720   0.82301   0.00435     2.8131    -0.246??
   9  AV AX   19       0.91090   0.82974   0.00673     4.5963     0.58843
  10  LOGCN    8       0.91210   0.83193   0.00219     1.4453    -1.4?24?
  11  PREMS   10       0.91270   0.83302   0.00109     0.7322    -0.199?0
  12  R INF   17       0.91290   0.83339   0.00037     0.2761     0.64315
  13  WORDS    6       0.91310   0.83375   0.00037     0.2529     0.02520
  14  PSLI    21       0.91310   0.83375   0.00000     0.0155     0.07453
  15  POSIT   22       0.91320   0.83393   0.00018     0.0595    -0.06?85
  16  AV RE   13       0.91340   0.83430   0.00037     0.2126     0.89661
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TABLE 5.9A 



STEP NUMBER 1 FOR C4

VARIABLE ENTERED      20
MULTIPLE R            0.6255
STD. ERROR OF EST.    2.6592

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1          558.996       558.996    79.048
RESIDUAL     123          869.804         7.072

VARIABLES IN EQUATION:  (CONSTANT = -4.70182)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
TOT R   20      0.41522     0.04670      79.0482 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.70182      0.6787     118.4180 (1)
CLAS2    2      0.76067      0.7716     167.5221 (1)
CLAS3    3      0.83827      0.7379     288.3557 (1)
CLAS5    5      0.91114      0.5436     596.3591 (1)
WORDS    6      0.13812      0.9995       2.3725 (2)
SYMBL    7      0.15679      0.9788       3.0747 (2)
LOGCN    8      0.01529      0.9254       0.0285 (2)
PAREN    9      0.12225      0.9986       1.8509 (2)
PREMS   10      0.07217      0.8616       0.6387 (2)
RE      11      0.51945      0.8850      45.0829 (2)
CP      12      0.03775      0.9301       0.1741 (2)
AV RE   13      0.12590      0.4021       1.9650 (2)
AXIOM   14      0.06831      0.7889       0.5720 (2)
THERM   15      0.27869      0.6166      10.2733 (2)
STEPS   16      0.50709      0.9716      42.2307 (2)
R INF   17      0.02415      0.2239       0.0712 (2)
AV TH   18      0.01788      0.4167       0.0390 (2)
AV AX   19     -0.06170      0.0969       0.4663 (2)
PSLI    21      0.04150      0.9144       0.2105 (2)
POSIT   22      0.06814      0.0520       0.5691 (2)
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TABLE 5.9B 



STEP NUMBER 2 FOR C4

VARIABLE ENTERED      11
MULTIPLE R            0.7453
STD. ERROR OF EST.    2.2816

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2          793.689       396.845    76.231
RESIDUAL     122          635.111         5.206

VARIABLES IN EQUATION:  (CONSTANT = -3.80896)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
RE      11      1.33047     0.19815      45.0829 (2)
TOT R   20      0.31825     0.04259      55.8293 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.55316      0.3328      53.3476 (1)
CLAS2    2      0.67451      0.2743     101.0027 (1)
CLAS3    3      0.79170      0.3284     203.2142 (1)
CLAS5    5      0.92200      0.4899     686.1566 (1)
WORDS    6      0.13115      0.9969       2.1175 (2)
SYMBL    7      0.17381      0.9786       3.7691 (2)
LOGCN    8     -0.04940      0.9142       0.2960 (2)
PAREN    9      0.12321      0.9976       1.8652 (2)
PREMS   10     -0.08903      0.7943       0.9667 (2)
CP      12     -0.03268      0.9153       0.1294 (2)
AV RE   13     -0.19930      0.2925       5.0047 (2)
AXIOM   14      0.03388      0.7844       0.1391 (2)
THERM   15      0.44406      0.5962      29.7202 (2)
STEPS   16      0.18510      0.3664       4.2930 (2)
R INF   17     -0.18443      0.1991       4.2609 (2)
AV TH   18      0.15258      0.3982       2.8843 (2)
AV AX   19      0.04883      0.0931       0.2892 (2)
PSLI    21     -0.04256      0.8940       0.2196 (2)
POSIT   22     -0.18213      0.0435       4.1512 (2)
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TABLE 5.9C 



STEP NUMBER 3 FOR C4

VARIABLE ENTERED      15
MULTIPLE R            0.8020
STD. ERROR OF EST.    2.0528

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3          918.926       306.309    72.691
RESIDUAL     121          509.874         4.214

VARIABLES IN EQUATION:  (CONSTANT = -1.12523)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
RE      11      1.51015     0.16130      69.3844 (2)
THERM   15      2.48564     0.45595      29.7202 (2)
TOT R   20      0.14754     0.04949       8.8880 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.55736      0.3272      54.0763 (1)
CLAS2    2      0.64216      0.2540      84.2096 (1)
CLAS3    3      0.76820      0.2971     172.7806 (1)
CLAS5    5      0.90518      0.3443     544.2899 (1)
WORDS    6      0.26274      0.9480       8.8981 (2)
SYMBL    7      0.22974      0.9737       6.6866 (2)
LOGCN    8     -0.03065      0.9119       0.1128 (2)
PAREN    9      0.11340      0.9952       1.5633 (2)
PREMS   10     -0.14879      0.7866       2.7168 (2)
CP      12     -0.01658      0.9138       0.0330 (2)
AV RE   13     -0.09770      0.2730       1.1565 (2)
AXIOM   14      0.28085      0.6365      10.2758 (2)
STEPS   16      0.21363      0.3663       5.7383 (2)
R INF   17      0.01678      0.1596       0.0338 (2)
AV TH   18     -0.12131      0.2785       1.7924 (2)
AV AX   19      0.12385      0.0913       1.8692 (2)
PSLI    21      0.01364      0.8805       0.0223 (2)
POSIT   22     -0.05951      0.0397       0.4264 (2)
TABLE 5.10

SUMMARY TABLE FOR C4 ON THE FULL SET OF PROBLEMS

STEP  VARIABLE        MULTIPLE            INCREASE    F VALUE    LAST REG
 NUM         ENT REM         R       RSQ    IN RSQ    FOR DEL  COEFFICNTS

   1  TOT R   20       0.62550   0.39125   0.39125    79.0482     0.89661
   2  RE      11       0.74530   0.55547   0.16422    45.0829     1.01207
   3  THERM   15       0.80200   0.64320   0.08773    29.7202     4.28564
   4  AXIOM   14       0.81930   0.67125   0.02805    10.2758     1.25623
   5  WORDS    6       0.83730   0.70107   0.02982    11.8369     0.06957
   6  STEPS   16       0.84040   0.70627   0.00520     2.0728     0.29581
   7  PAREN    9       0.84320   0.71099   0.00471     1.9620    -1.19005
   8  LOGCN    8       0.84710   0.71758   0.00659     2.6536    -1.63819
   9  TOT R       20   0.84710   0.71758   0.00000     0.0023
  10  SYMBL    7       0.85110   0.72437   0.00679     2.8960     0.10362
  11  AV TH   18       0.85320   0.72795   0.00358     1.5032     0.78169
  12  R INF   17       0.85490   0.73085   0.00290     1.2176     1.80194
  13  POSIT   22       0.85540   0.73171   0.00086     0.3643    -0.20106
  14  AV AX   19       0.85600   0.73274   0.00103     0.4798     1.14620
  15  PSLI    21       0.85790   0.73599   0.00326     1.2987     0.19715
  16  AV RE   13       0.86010   0.73977   0.00378     1.6326     1.94633
  17  CP      12       0.86050   0.74046   0.00069     0.2912     0.53331
  18  PREMS   10       0.86060   0.74063   0.00017     0.0415    -0.05619
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TABLE 5.11A 



STEP NUMBER 1 FOR C5

VARIABLE ENTERED      15
MULTIPLE R            0.6799
STD. ERROR OF EST.    2.1112

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1          471.399       471.399   105.759
RESIDUAL     123          548.249         4.457

VARIABLES IN EQUATION:  (CONSTANT = 2.15129)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
THERM   15      3.72353     0.36207     105.7586 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.70420      0.8942     120.0150 (1)
CLAS2    2      0.69615      0.9068     114.7198 (1)
CLAS3    3      0.77228      0.8689     180.2896 (1)
CLAS4    4      0.93227      0.6886     810.2264 (1)
WORDS    6      0.27808      0.9731      10.2250 (2)
SYMBL    7      0.13139      0.9785       2.1433 (2)
LOGCN    8     -0.08534      0.9513       0.8951 (2)
PAREN    9      0.09174      0.9969       1.0354 (2)
PREMS   10     -0.13052      0.9601       2.1142 (2)
RE      11      0.54088      0.9943      50.4507 (2)
CP      12     -0.05913      0.95?6       0.4281 (2)
AV RE   13      0.42484      0.8952      26.8693 (2)
AXIOM   14      0.46331      0.9995      33.3454 (2)
STEPS   16      0.46577      0.9998      33.7991 (2)
R INF   17      0.43381      0.8634      28.2824 (2)
AV TH   18      0.18192      0.4231       4.1758 (2)
AV AX   19      0.40594      0.6810      24.0704 (2)
TOT R   20      0.44211      0.6166      29.6405 (2)
PSLI    21     -0.00168      0.9150       0.0003 (2)
POSIT   22      0.45377      0.7060      31.6352 (2)
TABLE 5.11B

STEP NUMBER 2 FOR C5

VARIABLE ENTERED      11
MULTIPLE R            0.7872
STD. ERROR OF EST.    1.7830

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2          631.790       315.895    99.364
RESIDUAL     122          387.858         3.179

VARIABLES IN EQUATION:  (CONSTANT = 1.41221)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
RE      11      1.03769     0.14610      50.4507 (2)
THERM   15      3.55872     0.30666     134.6666 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.53615      0.3716      48.8150 (1)
CLAS2    2      0.53160      0.2593      47.6637 (1)
CLAS3    3      0.67509      0.3058     101.3209 (1)
CLAS4    4      0.91176      0.3831     596.3046 (1)
WORDS    6      0.28717      0.9685      10.8751 (2)
SYMBL    7      0.17151      0.9780       3.6673 (2)
LOGCN    8     -0.11727      0.9507       1.6871 (2)
PAREN    9      0.08390      0.9953       0.8577 (2)
PREMS   10     -0.24507      0.9423       7.7314 (2)
CP      12     -0.09717      0.9539       1.1535 (2)
AV RE   13      0.16089      0.5920       3.2154 (2)
AXIOM   14      0.41875      0.9504      25.7290 (2)
STEPS   16      0.07249      0.3721       0.6392 (2)
R INF   17      0.25235      0.6870       8.2296 (2)
AV TH   18      0.14447      0.4177       2.5792 (2)
AV AX   19      0.32129      0.6315      13.9278 (2)
TOT R   20      0.30788      0.5307      12.6709 (2)
PSLI    21     -0.04050      0.9118       0.1988 (2)
POSIT   22      0.28077      0.5638      10.3548 (2)
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TABLE 5.11C 



STEP NUMBER 3 FOR C5

VARIABLE ENTERED      14
MULTIPLE R            0.8284
STD. ERROR OF EST.    1.6258

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3          699.801       233.267    88.246
RESIDUAL     121          319.847         2.643

VARIABLES IN EQUATION:  (CONSTANT = 1.14557)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
RE      11      0.88414     0.13661      41.8856 (2)
AXIOM   14      1.35888     0.26790      25.7290 (2)
THERM   15      3.61508     0.27985     166.8704 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1    1      0.44565      0.3149      29.7381 (1)
CLAS2    2      0.48323      0.2422      36.5586 (1)
CLAS3    3      0.64213      0.2832      84.1981 (1)
CLAS4    4      0.89684      0.3300     493.2515 (1)
WORDS    6      0.31066      0.9684      12.8181 (2)
SYMBL    7      0.20321      0.9770       5.1688 (2)
LOGCN    8     -0.03412      0.9097       0.1398 (2)
PAREN    9      0.06757      0.9924       0.5504 (2)
PREMS   10     -0.14585      0.8675       2.6080 (2)
CP      12     -0.02018      0.9198       0.0489 (2)
AV RE   13     -0.00619      0.4989       0.0046 (2)
STEPS   16      0.08569      0.3720       0.8877 (2)
R INF   17      0.08151      0.5514       0.8027 (2)
AV TH   18      0.04849      0.3931       0.2829 (2)
AV AX   19      0.07521      0.3711       0.6827 (2)
TOT R   20      0.09045      0.3554       0.9899 (2)
PSLI    21      0.06441      0.8625       0.5000 (2)
POSIT   22      0.08181      0.4121       0.8086 (2)
TABLE 5.12

SUMMARY TABLE FOR C5 ON THE FULL SET OF PROBLEMS

STEP  VARIABLE        MULTIPLE            INCREASE    F VALUE    LAST REG
 NUM         ENT REM         R       RSQ    IN RSQ    FOR DEL  COEFFICNTS

   1  THERM   15       0.67990   0.46226   0.46226   105.7586     4.05968
   2  RE      11       0.78720   0.61968   0.15742    50.4507     0.67997
   3  AXIOM   14       0.82840   0.68625   0.06656    25.7290     1.25372
   4  WORDS    6       0.84650   0.71656   0.03032    12.8181     0.07214
   5  PAREN    9       0.85090   0.72403   0.00747     3.1787    -0.87724
   6  LOGCN    8       0.85280   0.72727   0.00324     1.4294    -1.10059
   7  SYMBL    7       0.85450   0.73017   0.00290     1.2940     0.0647?
   8  STEPS   16       0.85610   0.73291   0.00274     1.1375     0.11336
   9  PREMS   10       0.85680   0.73411   0.00120     0.5597    -0.11963
  10  AV TH   18       0.85720   0.73479   0.00069     0.2695    -0.10586
  11  R INF   17       0.85740   0.73513   0.00034     0.1253     0.32683
  12  PSLI    21       0.85750   0.73531   0.00017     0.0910     0.11460
  13  POSIT   22       0.85770   0.73565   0.00034     0.1139    -0.10662
  14  TOT R   20       0.85810   0.73634   0.00069     0.3230     0.58693
  15  AV RE   13       0.85930   0.73840   0.00206     0.8699     1.19229
  16  CP      12       0.85940   0.73857   0.00017     0.0841     0.27553
FIGURE 5.9 C5 VS S22(POSIT) 
FIGURE 5.10 RESIDUALS (Y-AXIS) VS COMPUTED C5 (X-AXIS) 
FIGURE 5.11A C1 VS S15(THERM) 
FIGURE 5.11B C2 VS S15(THERM) 
FIGURE 5.11C C3 VS S15(THERM) 
FIGURE 5.11D C4 VS S15(THERM) 
FIGURE 5.11E C5 VS S15(THERM) 
FIGURE 5.12A C1 VS S11(RE) 
FIGURE 5.12B C2 VS S11(RE) 
FIGURE 5.12C C3 VS S11(RE) 
FIGURE 5.12D C4 VS S11(RE) 
FIGURE 5.12E C5 VS S11(RE) 
CHAPTER SIX 

6.1 - INTRODUCTION 

In this chapter, the results of regression analyses 
over the restricted set of problems, which was discussed in 
Chapter IV, are examined. This restricted set consists 
of the 54 problems that appear after the introduction of 
Replace Equals and do not have any premises. As in the 
previous chapter, a separate analysis was run for each of 
the five partitions. The results of these analyses do not 
differ sharply from those discussed in Chapter V, and 
the discussion here will be brief. For the sake of 
completeness, however, a full set of tables of results is 
included. 

Table 6.1 lists the means and standard deviations for 
all 22 variables, using the restricted set of problems, and 
Table 6.2 is the correlation matrix for these variables. A 
plot of the correlations of S11(RE), S16(STEPS), and 
S15(THERM) with the dependent variable against the ordinal 
number of the dependent variable for the restricted set of 
problems is found in Figure 6.1. The pattern in Figure 6.1 
is very similar to that in Figure 5.1. 

6.2 - REGRESSION WITH C1 AS THE DEPENDENT VARIABLE 

For the first regression analysis, C1 is again the 
dependent variable. The results for the first three 
variables to enter the regression equation and a summary 
for the complete analysis are found in Tables 6.3A,B,C,D. 

Together, the first four variables to enter the 
equation account for over 70 percent of the total variance 
in the dependent variable. This is somewhat less than the 
77 percent that was accounted for by the first four 
variables when the full set of problems was used, but the 
fit is still quite good. Since the predictive power of 
several of the independent variables (S11(RE) and 
S22(POSIT), for example) was enhanced by the inclusion of 
the first fifty problems, the slight decrease in the 
variance accounted for by the regression equation is not 
surprising. 
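
The "percent of variance accounted for" quoted here is the 
squared multiple correlation, which can be recovered from the 
sums of squares printed in each analysis-of-variance table. 
The following is a minimal Python sketch using the step 3 
figures for C1 from Table 6.3C; the function name is 
illustrative and is not part of the program that produced 
the tables. 

```python
import math


def variance_accounted_for(ss_regression: float, ss_residual: float) -> float:
    """Return RSQ: the share of the total sum of squares explained by the regression."""
    ss_total = ss_regression + ss_residual
    return ss_regression / ss_total


# Sums of squares printed in Table 6.3C (step 3 for C1, restricted set).
r_squared = variance_accounted_for(1440.739, 730.465)
multiple_r = math.sqrt(r_squared)

print(f"RSQ = {r_squared:.5f}, multiple R = {multiple_r:.4f}")
```

The result can be compared with the RSQ and MULTIPLE R 
columns for step 3 in Table 6.3D. 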

The first variable to enter the equation is 
S16(STEPS). Using the full set of problems and C1 as the 
dependent variable, S11(RE) was the first variable to 
enter. It has already been observed that S11 and S16 serve 
very similar functions as measures of relatively 
superficial structural complexity. In the analysis of the 
full set of problems, the predictive power of S11 was 
enhanced by the fact that the first 53 problems had 
uniformly low values for the dependent variable and had 
value zero for S11. It is not surprising, then, that 
S16(STEPS) replaces S11(RE) as the first variable to enter 
the equation. 

The second variable entering the equation, using the 
restricted set, is S15(THERM). Using the full set, 
S22(POSIT) was the second variable to enter the equation. 
The overall predictive power of S22 was also enhanced by 
the inclusion of the first 53 problems in the analysis. 
With this effect eliminated, S15 is prominent even for the 
first partition. The relative predictive power of S15 is 
greater for the restricted set of problems, because the 
percentage of problems with values of S15 greater than zero 
is much larger than it was for the full set. 

The third and fourth variables to enter are S14(AXIOM) 
and S7(SYMBL). These are the same variables that entered 
as the third and fourth variables for the full set of 
problems. 

At this point it is appropriate to discuss the 
assumptions in the model for regression, specifically 
normality of the distribution of errors and homogeneity of 
their variance. 

Figure 6.2A contains a histogram for the residuals 
after all of the variables have entered the equation. The 
distribution does not indicate any serious violations of 
the normality assumption. A plot of the residuals against 
the computed value of C1 is found in Figure 6.2B. From 
this plot, it appears that the homogeneity-of-variance 
assumption is not seriously violated for the restricted set 
of problems. The highly significant values for the 
F-ratios in this analysis and in the other four analyses 
presented in this chapter provide reassurance that the 
results obtained are not due to chance. 
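
The two checks described above can be sketched in a few 
lines of Python: a histogram of the residuals in unit-width 
bins, and a rough comparison of residual spread between low 
and high computed values. The data below are hypothetical 
and serve only to make the sketch runnable; the function 
names are likewise illustrative and do not come from the 
original analysis. 

```python
import math
from collections import Counter
from statistics import pvariance


def residual_histogram(residuals):
    """Count residuals in unit-width bins [k, k+1), keyed by the lower edge k."""
    return Counter(math.floor(r) for r in residuals)


def spread_by_half(predicted, residuals):
    """Residual variance for the lower and upper halves of the predicted values."""
    pairs = sorted(zip(predicted, residuals))
    mid = len(pairs) // 2
    low = [r for _, r in pairs[:mid]]
    high = [r for _, r in pairs[mid:]]
    return pvariance(low), pvariance(high)


# Hypothetical predicted values and residuals, for illustration only.
predicted = [4.0, 6.5, 8.0, 9.5, 12.0, 14.5, 17.0, 20.0]
residuals = [-1.2, 0.4, 1.1, -0.6, 0.9, -1.5, 0.3, 0.6]

hist = residual_histogram(residuals)
for lower_edge in sorted(hist):
    # One asterisk per residual, in the style of a line-printer histogram.
    print(f"{lower_edge:6.1f} to {lower_edge + 1:5.1f}  " + "*" * hist[lower_edge])

low_var, high_var = spread_by_half(predicted, residuals)
print(f"residual variance: low half = {low_var:.3f}, high half = {high_var:.3f}")
```

A roughly symmetric, single-peaked histogram and similar 
variances in the two halves are what one would hope to see; 
this is an informal check, not a formal test. 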

The equation as a whole is significant at the .01 
level for all fourteen steps in the stepwise regression 
analysis presented. The F-ratios for adding each of the 
first four variables in the equation are also significant 
at the .01 level. Lastly, in Table 6.3D, it can be seen 
that the F-ratios for deleting any of the first four 
variables are also significant. 
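
The F-ratios referred to here can be recomputed from the 
printed tables. The sketch below, in Python, assumes the 
usual definitions: the F-ratio for the equation as a whole is 
the regression mean square divided by the residual mean 
square, and the F for deleting a single variable is the 
square of its coefficient divided by its standard error. 
These definitions are consistent with the step 3 values for 
C1 in Table 6.3C; the function names are illustrative only, 
and the comparison against a critical value of F is left out. 

```python
def overall_f_ratio(ss_regression, df_regression, ss_residual, df_residual):
    """F-ratio for the whole equation: regression mean square over residual mean square."""
    mean_square_regression = ss_regression / df_regression
    mean_square_residual = ss_residual / df_residual
    return mean_square_regression / mean_square_residual


def f_to_remove(coefficient, std_error):
    """F for deleting one variable: squared ratio of coefficient to standard error."""
    return (coefficient / std_error) ** 2


# Values printed in Table 6.3C (step 3 for C1, restricted set).
f_equation = overall_f_ratio(1440.739, 3, 730.465, 50)
f_axiom = f_to_remove(2.32139, 0.81589)  # S14(AXIOM)

print(f"F-ratio for the equation = {f_equation:.3f}")
print(f"F to remove AXIOM        = {f_axiom:.4f}")
```

Both results agree with the printed F-RATIO and F TO REMOVE 
entries of Table 6.3C to within rounding. 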

The question of statistical significance has been 
discussed only very briefly here, and will not be discussed 
for the other analyses in this chapter. The reason for 
this omission has already been explained. 

6.3 - REGRESSION WITH C2 AS THE DEPENDENT VARIABLE 

In the regression analysis for the second dependent 
variable, S16(STEPS) and S15(THERM) are the first two 
variables to enter; again, the patterns found in the tables 
of partial correlations for this analysis (Tables 6.4A,B) 
are generally similar to those for the corresponding analysis 
of the full set of problems (Tables 5.5A,B). The 
differences that do appear in these tables are principally 
due to the diminished predictive value of S11(RE) and the 
rule-position variables. 
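
The partial correlations in these tables determine which 
variable enters next, since the F TO ENTER column is a 
function of the partial correlation and the residual degrees 
of freedom. A minimal Python sketch of that relation follows, 
assuming the standard formula F = r^2 / (1 - r^2) times the 
residual degrees of freedom after the candidate variable has 
entered. The example uses the S15(THERM) row of Table 6.3A, 
where the printed value is consistent with this formula; the 
function name is illustrative only. 

```python
def f_to_enter(partial_corr: float, residual_df_after_entry: int) -> float:
    """F for adding a variable, computed from its partial correlation."""
    r2 = partial_corr ** 2
    return r2 / (1.0 - r2) * residual_df_after_entry


# Table 6.3A: with S16(STEPS) in the equation, S15(THERM) has a
# partial correlation of 0.38307. With 54 problems and two predictors
# after entry, the residual degrees of freedom are 54 - 2 - 1 = 51.
f_therm = f_to_enter(0.38307, 51)

print(f"F to enter for THERM = {f_therm:.4f}")
```

Because the degrees of freedom are the same for every 
candidate at a given step, the variable with the largest 
absolute partial correlation also has the largest F to 
enter. 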

The third variable to enter the equation is S12(CP), 
one of the problem-structure variables (Table 6.4C). A 
summary of the results for all variables in this analysis 
is included as Table 6.4D. Using the full set of problems 
and the second partition, the regression equation accounted 
for 82 percent of the variance in the dependent variable; 
with the restricted set used here, the regression equation 
accounts for 79 percent of the variance. 

Figure 6.3A contains a histogram of the residuals for 
this analysis and Figure 6.3B contains a plot of the 
residuals against the predicted value of C2. Neither of 
these figures indicates a serious violation of the 
assumptions. 

6.4 - DISCUSSION 

The results for the regression analyses using C3, C4, 
and C5 as dependent variables follow the pattern 
established in the last chapter and will not be discussed 
in detail here. For completeness, the results are included 
in Tables 6.5, 6.6, and 6.7, and Figures 6.4, 6.5, and 6.6. 
There are no serious violations of the assumptions in any 
of these analyses. 

The evidence for the restricted set of problems tends 
to confirm the general conclusions indicated by the 
analysis of the full set. There are two types of variation 
that appear in the sample of proofs. The first type of 
variation consists of the relatively superficial 
differences that appear in the proofs. Variation in the 
order in which rules are used in proofs is one example of 
this type of difference. The definitions of equivalence 
for the first two partitions are very sensitive to changes 
in order. The definition for the third partition is 
sensitive to some differences in order but not to all; the 
last two partitions completely ignore differences in order. 

Both S11(RE) and S16(STEPS) are good predictors of the 
dependent variables, C1 and C2, for the first two 
partitions; their importance systematically declines for 
the last three partitions, C3, C4, and C5. It appears that 
both of these variables are good predictors of sources of 
variation such as changes in the order in which rules are 
used, but are relatively ineffective in predicting more 
significant sources of variation such as differences in the 
rules used in a proof. 

The second principal source of variation found in this 
study is in the rules used to form the proofs. This type 
of variation is much more fundamental and important. It is 
predicted best by S15(THERM) and to a lesser degree by 
S14(AXIOM). All five of the partitions are sensitive to 
differences in the rules used in a proof. The relative 
importance of the set of rules used increases as we move 
from the first partition to the fifth, because other types 
of difference are successively being eliminated from 
consideration as we move from one partition to the next. 
The fifth partition is defined only by the particular rules 
used in the proofs. So, it is not surprising that the 
importance of rule-position variables increases from 
partition to partition. These observations will be 
developed in Chapters VII and VIII. 
TABLE 6.1

MEANS AND STANDARD DEVIATIONS FOR RESTRICTED SET

VARIABLE            MEAN   STANDARD DEVIATION

CLAS1    1      13.42593              6.40048
CLAS2    2       9.31481              6.51797
CLAS3    3       7.92593              5.63966
CLAS4    4       5.37037              3.66161
CLAS5    5       4.53704              2.96974
WORDS    6      15.12963              6.5532?
SYMBL    7      12.38889              6.02954
LOGCN    8       0.20370              0.40653
PAREN    9       1.03704              0.91038
PREMS   10       0.00000              0.00000
RE      11       1.16667              1.17762
CP      12       0.18519              0.39210
AV RE   13       1.00000              0.00000
AXIOM   14       0.55556              0.69137
THERM   15       0.42593              0.68960
STEPS   16       4.66667              3.15032
R INF   17      18.74074              0.55577
AV TH   18       1.40741              1.95727
AV AX   19       3.55556              1.72295
TOT R   20      23.70370              3.66848
PSLI    21       5.00000              3.49123
POSIT   22      95.92593             19.84834
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TABLE 6.2 



CORRELATION MATRIX FOR RESTRICTED SET 



VARIABLE
NUMBER        1       2       3       4       5

   1      1.000   0.934   0.914   0.739   0.588
   2              1.000   0.972   0.747   0.556
   3                      1.000   0.824   0.654
   4                              1.000   0.913
   5                                      1.000

MATRIX CONTINUED
VARIABLE
NUMBER        6       7       8       9      10

   1      0.462   0.387  -0.019   0.324   0.000
   2      0.402   0.322  -0.018   0.275   0.000
   3      0.413   0.347  -0.100   0.284   0.000
   4      0.239   0.117  -0.140   0.098   0.000
   5      0.127   0.004  -0.249   0.055   0.000
   6      1.000   0.626   0.025   0.429   0.000
   7              1.000   0.190   0.747   0.000
   8                      1.000  -0.174   0.000
   9                              1.000   0.000
  10                                      0.000

MATRIX CONTINUED
VARIABLE
NUMBER       11      12      13      14      15

   1      0.726  -0.077   0.000   0.116   0.035
   2      0.804  -0.097   0.000   0.015   0.028
   3      0.758  -0.147   0.000   0.040   0.096
   4      0.467  -0.128   0.000   0.066   0.370
   5      0.265  -0.217   0.000   0.036   0.559
   6      0.381   0.049   0.000   0.142  -0.305
   7      0.254   0.208   0.000   0.060  -0.236
   8      0.125   0.943   0.000  -0.209  -0.315
   9      0.223  -0.125   0.000   0.027   0.004
  10      0.000   0.000   0.000   0.000   0.000
  11      1.000   0.095   0.000  -0.023  -0.205
  12              1.000   0.000  -0.178  -0.297
  13                      0.000   0.000   0.000
  14                              1.000  -0.347
  15                                      1.000

TABLE 6.2 CONTINUED

MATRIX CONTINUED
VARIABLE
NUMBER       16      17      18      19      20

   1      0.736   0.122   0.016       ?       ?
   2          ?   0.148  -0.027       ?       ?
   3      0.727   0.210   0.023   0.084   0.084
   4      0.402   0.354   0.297   0.311   0.358
   5      0.151   0.349   0.491   0.427   0.515
   6      0.444  -0.198  -0.285  -0.112  -0.235
   7      0.311  -0.330  -0.423  -0.339  -0.435
   8      0.275  -0.430  -0.296  -0.488  -0.452
   9      0.221  -0.130  -0.273  -0.098  -0.211
  10      0.000   0.000   0.000   0.000   0.000
  11      0.926  -0.019  -0.112  -0.167  -0.141
  12      0.204  -0.468  -0.272  -0.462  -0.433
  13      0.000   0.000   0.000   0.000   0.000
  14     -0.009   0.186  -0.087   0.243   0.096
  15     -0.290   0.294   0.736   0.528   0.685
  16      1.000  -0.061  -0.201  -0.229  -0.224
  17              1.000   0.342   0.764   0.693
  18                      1.000   0.614   0.874
  19                              1.000   0.913
  20                                      1.000

MATRIX CONTINUED
VARIABLE
NUMBER       21      22

   1      0.192   0.073
   2      0.159   0.025
   3      0.106   0.101
   4     -0.028   0.365
   5     -0.111   0.521
   6      0.319  -0.193
   7      0.187  -0.418
   8      0.306  -0.447
   9      0.059  -0.201
  10      0.000   0.000
  11      0.110  -0.139
  12      0.221  -0.439
  13      0.000   0.000
  14     -0.102   0.084
  15     -0.384   0.667
  16      0.259  -0.211
  17     -0.224   0.710
  18     -0.353   0.853
  19     -0.420   0.910
  20     -0.420   0.990
  21      1.000  -0.310
  22              1.000
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FIGURE 6.1 



CORRELATIONS BETWEEN DEPENDENT AND INDEPENDENT VARIABLES 
AGAINST THE ORDINAL NUMBER OF THE DEPENDENT VARIABLE 

(1) S11(RE) 

(2) S16(STEPS) 
TABLE 6.3A

STEP NUMBER 1 FOR C1

VARIABLE ENTERED      16
MULTIPLE R            0.7361
STD. ERROR OF EST.    4.3736

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1         1176.510      1176.510    61.505
RESIDUAL      52          994.693        19.129

VARIABLES IN EQUATION:  (CONSTANT = 6.44663)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
STEPS   16      1.49556     0.19070      61.5049 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS2    2      0.85137      0.3466     134.3401 (1)
CLAS3    3      0.81412      0.4713     100.2426 (1)
CLAS4    4      0.71580      0.8385      53.5881 (1)
CLAS5    5      0.71352      0.9773      52.8930 (1)
WORDS    6      0.22308      0.8024       2.6708 (2)
SYMBL    7      0.24619      0.9033       3.2906 (2)
LOGCN    8     -0.34100      0.9244       6.7106 (2)
PAREN    9      0.24430      0.9509       3.2370 (2)
PREMS   10      0.00000      1.0000       0.0000 (2)
RE      11      0.17565      0.1432       1.6237 (2)
CP      12     -0.34264      0.9585       6.7838 (2)
AV RE   13      0.00000      1.0000       0.0000 (2)
AXIOM   14      0.18092      0.9999       1.7257 (2)
THERM   15      0.38307      0.9162       8.7709 (2)
R INF   17      0.24683      0.9963       3.3086 (2)
AV TH   18      0.24723      0.9596       3.3202 (2)
AV AX   19      0.32965      0.9474       6.2177 (2)
TOT R   20      0.32552      0.9497       6.0446 (2)
PSLI    21      0.00151      0.9329       0.0001 (2)
POSIT   22      0.34553      0.9553       6.9142 (2)
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TABLE 6.3B 



STEP NUMBER 2 FOR C1

VARIABLE ENTERED      15
MULTIPLE R            0.7804
STD. ERROR OF EST.    4.0794

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2         1322.473       661.237    39.734
RESIDUAL      51          848.730        16.642

VARIABLES IN EQUATION:  (CONSTANT = 4.63224)

VARIABLE    COEFFICIENT  STD. ERROR  F TO REMOVE
THERM   15      2.51418     0.84894       8.7709 (2)
STEPS   16      1.65489     0.18583      79.3066 (2)

VARIABLES NOT IN EQUATION:

VARIABLE  PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS2    2      0.82327      0.2715     105.1697 (1)
CLAS3    3      0.77770      0.3691      76.5257 (1)
CLAS4    4      0.65486      0.5807      37.5408 (1)
CLAS5    5      0.65929      0.5813      38.4422 (1)
WORDS    6      0.33375      0.7686       6.2674 (2)
SYMBL    7      0.33730      0.8802       6.4188 (2)
LOGCN    8     -0.27201      0.8637       3.9952 (2)
PAREN    9      0.23464      0.9458       2.9133 (2)
PREMS   10      0.00000      1.0000       0.0000 (2)
RE      11      0.12015      0.1389       0.7324 (2)
CP      12     -0.27452      0.8966       4.0751 (2)
AV RE   13      0.00000      1.0000       0.0000 (2)
AXIOM   14      0.37329      0.8663       8.0952 (2)
R INF   17      0.15402      0.9132       1.2149 (2)
AV TH   18     -0.04632      0.4586       0.1075 (2)
AV AX   19      0.17447      0.7152       1.5697 (2)
TOT R   20      0.10280      0.5303       0.5340 (2)
PSLI    21      0.14881      0.8287       1.1323 (2)
POSIT   22      0.13856      0.5550       0.9788 (2)
TABLE 6.3C

STEP NUMBER 3 FOR C1

VARIABLE ENTERED       14
MULTIPLE R             0.8146
STD. ERROR OF EST.     3.8222

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3         1440.739       480.246    32.873
RESIDUAL      50          730.465        14.609

VARIABLES IN EQUATION: (CONSTANT= 2.68057 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
AXIOM  14        2.32139      0.81589       8.0952 (2)
THERM  15        3.40299      0.85455      15.8578 (2)
STEPS  16        1.71563      0.17542      95.6551 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS2   2        0.81228      0.2543      95.0327 (1)
CLAS3   3        0.75434      0.3384      64.6947 (1)
CLAS4   4        0.60134      0.5054      27.7551 (1)
CLAS5   5        0.60360      0.4990      28.0841 (1)
WORDS   6        0.32246      0.7615       5.6861 (2)
SYMBL   7        0.36018      0.8801       7.3045 (2)
LOGCN   8       -0.16531      0.7623       1.3766 (2)
PAREN   9        0.22901      0.9424       2.7120 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
RE     11        0.11934      0.1388       0.7079 (2)
CP     12       -0.18254      0.8142       1.6891 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
R INF  17        0.03668      0.8159       0.0660 (2)
AV TH  18       -0.16474      0.4251       1.3669 (2)
AV AX  19       -0.02959      0.5145       0.0429 (2)
TOT R  20       -0.09793      0.4041       0.4745 (2)
PSLI   21        0.27278      0.7741       3.9391 (2)
POSIT  22       -0.03659      0.4415       0.0657 (2)

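The entries in a step table are tied together by standard regression identities, which gives a way to check a transcription of the printout. The sketch below uses the figures of Table 6.3C; the function names are illustrative, and the formula for F TO ENTER is an assumption about how the program computed it (the usual partial-correlation form), not something stated in the printout.

```python
import math


def anova_summary(ss_reg, ss_res, df_reg, df_res):
    """Return (multiple R, R squared, F-ratio) from the analysis-of-variance entries."""
    r_squared = ss_reg / (ss_reg + ss_res)
    f_ratio = (ss_reg / df_reg) / (ss_res / df_res)
    return math.sqrt(r_squared), r_squared, f_ratio


def f_to_enter(partial_corr, df_residual):
    """F to enter for a candidate variable (assumed form).

    df_residual is the residual DF of the current equation; entering the
    candidate would use up one more degree of freedom.
    """
    r2 = partial_corr ** 2
    return r2 / (1.0 - r2) * (df_residual - 1)


# Figures taken from Table 6.3C (step 3 for C1).
multiple_r, r_squared, f_ratio = anova_summary(1440.739, 730.465, 3, 50)
symbl_f = f_to_enter(0.36018, 50)  # SYMBL, the variable entered at step 4

print(round(multiple_r, 4), round(r_squared, 5), round(f_ratio, 3), round(symbl_f, 4))
```

Within rounding, the computed values agree with the printed MULTIPLE R, the RSQ for step 3 in the summary table, the F-RATIO, and the F TO ENTER listed for SYMBL.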
TABLE 6.3D

SUMMARY TABLE FOR C1:

STEP   VARIABLE     MULTIPLE             INCREASE   F VALUE    LAST REG
NUM    ENT REM      R         RSQ        IN RSQ     FOR DEL    COEFFICNTS
  1    STEPS  16    0.73610   0.54184    0.54184    61.5049     0.75218
  2    THERM  15    0.78040   0.60902    0.06718     8.7709     4.44347
  3    AXIOM  14    0.81460   0.66357    0.05455     8.0952     2.15535
  4    SYMBL   7    0.84100   0.70728    0.04371     7.3045     0.41046
  5    PSLI   21    0.85310   0.72778    0.02050     3.6330     0.56270
  6    RE     11    0.86410   0.74667    0.01889     3.4873     2.20050
  7    CP     12    0.87090   0.75847    0.01180     2.2580    -2.99584
  8    PAREN   9    0.87700   0.76913    0.01066     2.0799    -1.37648
  9    R INF  17    0.87890   0.77247    0.00334     0.6434     1.90201
 10    AV TH  18    0.87960   0.77370    0.00123     0.2448     0.92563
 11    AV AX  19    0.88010   0.77458    0.00088     0.1563     1.51330
 12    POSIT  22    0.88140   0.77687    0.00229     0.4215    -0.22560
 13    WORDS   6    0.88160   0.77722    0.00035     0.0474     0.03198
 14    LOGCN   8    0.88160   0.77722    0.00000     0.0183     0.61787



FIGURE 6.2A - RESIDUALS FOR C1 ON RESTRICTED SET

FIGURE 6.2B - RESIDUALS (Y-AXIS) VS COMPUTED C1 (X-AXIS)

TABLE 6.4A

STEP NUMBER 1 FOR C2

VARIABLE ENTERED       16
MULTIPLE R             0.8083
STD. ERROR OF EST.     3.8743

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1         1471.128      1471.128    98.010
RESIDUAL      52          780.520        15.010

VARIABLES IN EQUATION: (CONSTANT= 1.51042 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
STEPS  16        1.67237      0.16893      98.0099 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.85137      0.4581     134.3401 (1)
CLAS3   3        0.95144      0.4713     487.1914 (1)
CLAS4   4        0.78282      0.8385      80.7189 (1)
CLAS5   5        0.74691      0.9773      64.3511 (1)
WORDS   6        0.08080      0.8024       0.3351 (2)
SYMBL   7        0.12606      0.9033       0.8236 (2)
LOGCN   8       -0.42367      0.9244      11.1571 (2)
PAREN   9        0.16652      0.9509       1.4545 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
RE     11        0.25148      0.1432       3.4432 (2)
CP     12       -0.45399      0.9585      13.2407 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.03718      0.9999       0.0706 (2)
THERM  15        0.46558      0.9162      14.1148 (2)
R INF  17        0.33577      0.9963       6.4806 (2)
AV TH  18        0.23564      0.9596       2.9984 (2)
AV AX  19        0.31351      0.9474       5.5590 (2)
TOT R  20        0.32554      0.9497       6.0453 (2)
PSLI   21       -0.08825      0.9329       0.4003 (2)
POSIT  22        0.33972      0.9553       6.6540 (2)



TABLE 6.4B

STEP NUMBER 2 FOR C2

VARIABLE ENTERED       15
MULTIPLE R             0.8535
STD. ERROR OF EST.     3.4622

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2         1640.320       820.160    68.422
RESIDUAL      51          611.328        11.987

VARIABLES IN EQUATION: (CONSTANT= -0.44301 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
THERM  15        2.70686      0.72049      14.1148 (2)
STEPS  16        1.84391      0.15771     136.6924 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.82327      0.3909     105.1697 (1)
CLAS3   3        0.93799      0.3691     366.0775 (1)
CLAS4   4        0.71237      0.5807      51.5183 (1)
CLAS5   5        0.66009      0.5813      38.6080 (1)
WORDS   6        0.20364      0.7686       2.1631 (2)
SYMBL   7        0.22967      0.8802       2.7842 (2)
LOGCN   8       -0.35585      0.8637       7.2493 (2)
PAREN   9        0.14991      0.9458       1.1495 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
RE     11        0.19601      0.1389       1.9977 (2)
CP     12       -0.39212      0.8966       9.0847 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.25175      0.8663       3.3832 (2)
R INF  17        0.23761      0.9132       2.9919 (2)
AV TH  18       -0.16474      0.4586       1.3949 (2)
AV AX  19        0.10800      0.7152       0.5900 (2)
TOT R  20        0.02441      0.5303       0.0298 (2)
PSLI   21        0.08077      0.8287       0.3283 (2)
POSIT  22        0.05684      0.5550       0.1621 (2)

TABLE 6.4C

STEP NUMBER 3 FOR C2

VARIABLE ENTERED       12
MULTIPLE R             0.8776
STD. ERROR OF EST.     3.2166

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3         1734.316       578.105    55.874
RESIDUAL      50          517.332        10.347

VARIABLES IN EQUATION: (CONSTANT= 0.17969 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
CP     12       -3.58702      1.19009       9.0847 (2)
THERM  15        2.17648      0.69213       9.8886 (2)
STEPS  16        1.90122      0.14775     165.5704 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.80901      0.3614      92.8201 (1)
CLAS3   3        0.92749      0.3171     301.5748 (1)
CLAS4   4        0.73022      0.5730      55.9761 (1)
CLAS5   5        0.66932      0.5721      39.7655 (1)
WORDS   6        0.17763      0.7602       1.5965 (2)
SYMBL   7        0.30331      0.8674       4.9644 (2)
LOGCN   8        0.03721      0.1041       0.0679 (2)
PAREN   9        0.09363      0.9198       0.4334 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
RE     11        0.12320      0.1323       0.7552 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.15161      0.7867       1.1528 (2)
R INF  17        0.08547      0.7487       0.3606 (2)
AV TH  18       -0.21655      0.4552       2.4108 (2)
AV AX  19       -0.04307      0.6177       0.0910 (2)
TOT R  20       -0.12015      0.4731       0.7177 (2)
PSLI   21        0.13171      0.8201       0.8650 (2)
POSIT  22       -0.08777      0.4914       0.3804 (2)

TABLE 6.4D

SUMMARY TABLE FOR C2:

STEP   VARIABLE     MULTIPLE             INCREASE   F VALUE    LAST REG
NUM    ENT REM      R         RSQ        IN RSQ     FOR DEL    COEFFICNTS
  1    STEPS  16    0.80830   0.65335    0.65335    98.0099     1.07026
  2    THERM  15    0.85350   0.72846    0.07511    14.1148     4.74007
  3    CP     12    0.87760   0.77018    0.04172     9.0847    -4.37355
  4    SYMBL   7    0.88960   0.79139    0.02121     4.9644     0.33321
  5    PAREN   9    0.89700   0.80461    0.01322     3.2381    -1.65153
  6    AV TH  18    0.90060   0.81108    0.00647     1.6413    -1.65153
  7    AXIOM  14    0.90500   0.81903    0.00794     2.0218     1.31265
  8    RE     11    0.90930   0.82683    0.00780     1.9887     2.06043
  9    PSLI   21    0.91510   0.83741    0.01058     2.8831     0.30025
 10    R INF  17    0.91810   0.84291    0.00550     1.5033     2.27328
 11    POSIT  22    0.91890   0.84438    0.00147     0.4185    -0.11116
 12    LOGCN   8    0.91900   0.84456    0.00018     0.0423     0.94639
 13    AV AX  19    0.91910   0.84474    0.00018     0.0444     0.35906
 14    AV TH  18    0.91910   0.84474    0.00000     0.0001
 15    WORDS   6    0.91920   0.84493    0.00018     0.0146     0.01311

FIGURE 6.3A - RESIDUALS FOR C2 ON RESTRICTED SET

RANGE     0         10        20        30        40        50
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I
-10.000-  I
-9.001    I
-9.000-   I
-8.001    I
-8.000-   I
-7.001    I
-7.000-   I
-6.001    I
-6.000-   I**
-5.001    I**
-5.000-   I**
-4.001    I**
-4.000-   I*
-3.001    I*
-3.000-
-2.001    I*******
-2.000-
-1.001    I*********
-1.000-
-.001
.000-
.999      I**********
1.000-    I*****
1.999     I*****
2.000-    I****
2.999     I****
3.000-    I****
3.999     I****
4.000-    I***
4.999     I***
5.000-    I
5.999     I
6.000-    I*
6.999     I*
7.000-    I
7.999     I
8.000-    I
8.999     I
9.000-    I
10.000    I
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I
          0         10        20        30        40        50

FIGURE 6.3B - RESIDUALS (Y-AXIS) VS COMPUTED C2 (X-AXIS)

TABLE 6.5A

STEP NUMBER 1 FOR C3

VARIABLE ENTERED       11
MULTIPLE R             0.7576
STD. ERROR OF EST.     3.7164

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1          967.498       967.498    70.049
RESIDUAL      52          718.206        13.812

VARIABLES IN EQUATION: (CONSTANT= 3.69312 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        3.62812      0.43349      70.0494 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.80970      0.4724      97.0869 (1)
CLAS2   2        0.93578      0.3532     359.2754 (1)
CLAS4   4        0.81431      0.7822     100.3802 (1)
CLAS5   5        0.71918      0.9296      54.6369 (1)
WORDS   6        0.20568      0.8548       2.2528 (2)
SYMBL   7        0.24438      0.9356       3.2392 (2)
LOGCN   8       -0.30083      0.9844       5.0748 (2)
PAREN   9        0.18015      0.9503       1.7106 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.33782      0.9909       6.5699 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.08788      0.9995       0.3969 (2)
THERM  15        0.39302      0.9579       9.3169 (2)
STEPS  16        0.10468      0.1432       0.5651 (2)
R INF  17        0.34481      0.9996       6.8820 (2)
AV TH  18        0.16658      0.9875       1.4557 (2)
AV AX  19        0.32748      0.9720       6.1262 (2)
TOT R  20        0.29514      0.9801       4.8664 (2)
PSLI   21        0.03534      0.9879       0.0638 (2)
POSIT  22        0.31943      0.9806       5.7951 (2)

TABLE 6.5B

STEP NUMBER 2 FOR C3

VARIABLE ENTERED       15
MULTIPLE R             0.7998
STD. ERROR OF EST.     3.4507

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2         1078.436       539.216    45.285
RESIDUAL      51          607.268        11.907

VARIABLES IN EQUATION: (CONSTANT= 2.47952 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        3.88574      0.41125      89.2764 (2)
THERM  15        2.14364      0.70229       9.3169 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.79388      0.4370      85.2225 (1)
CLAS2   2        0.92830      0.3142     312.0335 (1)
CLAS4   4        0.77776      0.5559      76.5551 (1)
CLAS5   5        0.66357      0.5371      39.3372 (1)
WORDS   6        0.34154      0.8013       6.6028 (2)
SYMBL   7        0.35542      0.9004       7.2296 (2)
LOGCN   8       -0.20917      0.8968       2.2877 (2)
PAREN   9        0.17366      0.9477       1.5548 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.25620      0.9104       3.5124 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.26730      0.8700       3.8474 (2)
STEPS  16        0.23743      0.1329       2.9872 (2)
R INF  17        0.26014      0.9121       3.6293 (2)
AV TH  18       -0.19412      0.4571       1.9578 (2)
AV AX  19        0.16016      0.7180       1.3163 (2)
TOT R  20        0.04303      0.5311       0.0927 (2)
PSLI   21        0.21244      0.8515       2.3631 (2)
POSIT  22        0.08760      0.5554       0.3866 (2)

TABLE 6.5C

STEP NUMBER 3 FOR C3

VARIABLE ENTERED       7
MULTIPLE R             0.8278
STD. ERROR OF EST.     3.2575

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3         1155.150       385.050    36.288
RESIDUAL      50          530.554        10.611

VARIABLES IN EQUATION: (CONSTANT= -0.00639 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
SYMBL   7        0.21028      0.07821       7.2296 (2)
RE     11        3.65488      0.39760      84.4980 (2)
THERM  15        2.49612      0.67580      13.6423 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.76201      0.3740      67.8519 (1)
CLAS2   2        0.92249      0.2675     279.8286 (1)
CLAS4   4        0.79091      0.5473      81.8544 (1)
CLAS5   5        0.68238      0.5338      42.6985 (1)
WORDS   6        0.18041      0.5394       1.6486 (2)
LOGCN   8       -0.26918      0.6850       3.8278 (2)
PAREN   9       -0.15661      0.4039       1.2319 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.33260      0.8914       6.0946 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.28652      0.8700       4.382? (2)
STEPS  16        0.19408      0.1293       1.9180 (2)
R INF  17        0.40983      0.8315       9.8914 (2)
AV TH  18       -0.05984      0.3832       0.1761 (2)
AV AX  19        0.27572      0.6727       4.0315 (2)
TOT R  20        0.21371      0.4481       2.3450 (2)
PSLI   21        0.18915      0.8425       1.8183 (2)
POSIT  22        0.25172      0.4798       3.3145 (2)

TABLE 6.5D

SUMMARY TABLE FOR C3:

STEP   VARIABLE     MULTIPLE             INCREASE   F VALUE    LAST REG
NUM    ENT REM      R         RSQ        IN RSQ     FOR DEL    COEFFICNTS
  1    RE     11    0.75760   0.57396    0.57396    70.0494     2.63?4??
  2    THERM  15    0.79980   0.63968    0.06572     9.3169     4.?23??
  3    SYMBL   7    0.82760   0.68525    0.04557     7.2296     0.47907
  4    R INF  17    0.85910   0.73805    0.05280     9.8914     3.040?3
  5    PSLI   21    0.86900   0.75516    0.01711     3.3242     0.43077
  6    AXIOM  14    0.87780   0.77053    0.01537     3.1591     1.0????
  7    PAREN   9    0.88590   0.78482    0.01429     3.0652    -2.347?0
  8    CP     12    0.89790   0.80622    0.02141     4.9519    -0.??024
  9    AV TH  18    0.90500   0.82065    0.01443     3.541?     0.5?327
 10    STEPS  16    0.90970   0.82755    0.00690     1.7355     0.51???
 11    LOGCN   8    0.91180   0.83136    0.00383     0.9394    -2.9374?
 12    POSIT  22    0.91200   0.83174    0.00036     0.0761    -0.2313?
 13    AV AX  19    0.91360   0.83466    0.00292     0.7454     1.385?3

FIGURE 6.4A - RESIDUALS FOR C3 ON RESTRICTED SET

RANGE     0         10        20        30        40        50
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I
-10.000-  I
-9.001    I
-9.000-   I
-8.001    I
-8.000-   I
-7.001    I
-7.000-   I
-6.001    I
-6.000-   I
-5.001    I
-5.000-   I***
-4.001
-4.000-   I**
-3.001    I**
-3.000-   I***
-2.001    I***
-2.000-
-1.001
-1.000-
-.001
.000-
.999      I*********
1.000-
1.999
2.000-
2.999
3.000-
3.999     I*****
4.000-    I*
4.999     I*
5.000-    I*
5.999     I*
6.000-    I
6.999     I
7.000-    I
7.999     I
8.000-    I
8.999     I
9.000-    I
10.000    I
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I
          0         10        20        30        40        50

FIGURE 6.4B - RESIDUALS (Y-AXIS) VS COMPUTED C3 (X-AXIS)

TABLE 6.6A

STEP NUMBER 1 FOR C4

VARIABLE ENTERED       11
MULTIPLE R             0.4667
STD. ERROR OF EST.     3.2693

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1          154.800       154.800    14.483
RESIDUAL      52          555.793        10.688

VARIABLES IN EQUATION: (CONSTANT= 3.67725 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        1.45125      0.38134      14.4831 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.65876      0.4724      39.1000 (1)
CLAS2   2        0.70676      0.3532      50.9010 (1)
CLAS3   3        0.81431      0.4261     100.3802 (1)
CLAS5   5        0.92566      0.9296     305.2660 (1)
WORDS   6        0.07525      0.8548       0.2904 (2)
SYMBL   7       -0.00137      0.9356       0.0000 (2)
LOGCN   8       -0.22635      0.9844       2.7540 (2)
PAREN   9       -0.00737      0.9503       0.0028 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.19540      0.9909       2.0246 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.08716      0.9995       0.3904 (2)
THERM  15        0.53784      0.9579      20.7570 (2)
STEPS  16       -0.09022      0.1432       0.4185 (2)
R INF  17        0.41054      0.9996      10.3381 (2)
AV TH  18        0.39748      0.9875       9.5693 (2)
AV AX  19        0.44595      0.9720      12.6601 (2)
TOT R  20        0.48427      0.9801      15.6243 (2)
PSLI   21       -0.09039      0.9879       0.4201 (2)
POSIT  22        0.49136      0.9806      16.2321 (2)

TABLE 6.6B

STEP NUMBER 2 FOR C4

VARIABLE ENTERED       15
MULTIPLE R             0.6664
STD. ERROR OF EST.     2.7831

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2          315.573       157.786    20.371
RESIDUAL      51          395.020         7.745

VARIABLES IN EQUATION: (CONSTANT= 2.21628 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        1.76139      0.33168      28.2006 (2)
THERM  15        2.58059      0.56642      20.7570 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.63086      0.4370      33.0537 (1)
CLAS2   2        0.66398      0.3142      39.4253 (1)
CLAS3   3        0.77776      0.3602      76.5551 (1)
CLAS5   5        0.89915      0.5371     211.0452 (1)
WORDS   6        0.25715      0.8013       3.5403 (2)
SYMBL   7        0.12449      0.9004       0.7871 (2)
LOGCN   8       -0.08188      0.8968       0.3375 (2)
PAREN   9       -0.04238      0.9477       0.0899 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.05213      0.9104       0.1363 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.35691      0.8700       7.2988 (2)
STEPS  16        0.06689      0.1329       0.2247 (2)
R INF  17        0.31214      0.9121       5.3973 (2)
AV TH  18        0.00579      0.4571       0.0017 (2)
AV AX  19        0.23603      0.7180       2.9499 (2)
TOT R  20        0.19374      0.5311       1.9499 (2)
PSLI   21        0.13982      0.8515       0.9970 (2)
POSIT  22        0.21623      0.5554       2.4523 (2)

TABLE 6.6C

STEP NUMBER 3 FOR C4

VARIABLE ENTERED       14
MULTIPLE R             0.7176
STD. ERROR OF EST.     2.6256

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3          365.891       121.964    17.691
RESIDUAL      50          344.702         6.894

VARIABLES IN EQUATION: (CONSTANT= 1.03759 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        1.84887      0.31459      34.5392 (2)
AXIOM  14        1.51097      0.55928       7.2988 (2)
THERM  15        3.13748      0.57276      30.0069 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.58273      0.3907      25.1952 (1)
CLAS2   2        0.64743      0.3016      35.3613 (1)
CLAS3   3        0.75805      0.3345      66.1977 (1)
CLAS5   5        0.88348      0.4550     174.2755 (1)
WORDS   6        0.24514      0.7960       3.1329 (2)
SYMBL   7        0.13382      0.9004       0.8934 (2)
LOGCN   8        0.05078      0.7846       0.1267 (2)
PAREN   9       -0.06662      0.9448       0.2184 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12        0.06675      0.8218       0.2193 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
STEPS  16        0.09859      0.1322   [illegible] (2)
R INF  17        0.22129      0.8144   [illegible] (2)
AV TH  18       -0.10233      0.4229   [illegible] (2)
AV AX  19        0.05844      0.5150       0.1679 (2)
TOT R  20        0.02276      0.4031   [illegible] (2)
PSLI   21        0.26233      0.7897       3.6213 (2)
POSIT  22        0.06511      0.4409       0.2086 (2)

TABLE 6.6D

SUMMARY TABLE FOR C4:

STEP   VARIABLE     MULTIPLE             INCREASE   F VALUE    LAST REG
NUM    ENT REM      R         RSQ        IN RSQ     FOR DEL    COEFFICNTS
  1    RE     11    0.46670   0.21781    0.21781    14.4831     1.97847
  2    THERM  15    0.66640   0.44409    0.22628    20.7570     4.71216
  3    AXIOM  14    0.71760   0.51495    0.07086     7.2988     1.60187
  4    PSLI   21    0.74050   0.54834    0.03339     3.6213     0.40789
  5    R INF  17    0.75800   0.57456    0.02622     2.9664     3.13453
  6    WORDS   6    0.77420   0.59939    0.02482     2.9168     0.11472
  7    AV AX  19    0.78530   0.61670    0.01731     2.0714     0.11472
  8    CP     12    0.79590   0.63346    0.01676     2.0570     3.6220?
  9    PAREN   9    0.80290   0.64465    0.01119     1.3767    -1.47970
 10    SYMBL   7    0.80860   0.65383    0.00919     1.1491     0.18363
 11    LOGCN   8    0.81700   0.66749    0.01366     1.7186    -3.38?08
 12    POSIT  22    0.82280   0.67700    0.00951     1.2225    -0.29187
 13    AV AX  19    0.82280   0.67700    0.00000     0.0005
 14    TOT R  20    0.82720   0.68426    0.00726     0.9382     1.20037
 15    STEPS  16    0.82800   0.68558    0.00132     0.1589    -0.14571

FIGURE 6.5A - RESIDUALS FOR C4 ON RESTRICTED SET

RANGE     0         10        20        30        40        50
-10.000-  I
-9.001    I
-9.000-   I
-8.001    I
-8.000-   I
-7.001    I
-7.000-   I
-6.001    I
-6.000-   I
-5.001    I
-5.000-   I**
-4.001    I**
-4.000-   I*
-3.001    I*
-3.000-   I****
-2.001    I****
-2.000-
-1.001    I**********
-1.000-
-.001     I**************
.000-
.999
1.000-    I******
1.999     I******
2.000-    I***
2.999     I***
3.000-    I****
3.999     I****
4.000-    I**
4.999     I**
5.000-    I
5.999     I
6.000-    I
6.999     I
7.000-    I
7.999     I
8.000-    I
8.999     I
9.000-    I
10.000    I
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I

FIGURE 6.5B - RESIDUALS (Y-AXIS) VS COMPUTED C4 (X-AXIS)

TABLE 6.7A

STEP NUMBER 1 FOR C5

VARIABLE ENTERED       15
MULTIPLE R             0.5588
STD. ERROR OF EST.     2.4865

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     1          145.939       145.939    23.605
RESIDUAL      52          321.487         6.182

VARIABLES IN EQUATION: (CONSTANT= 3.51212 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
THERM  15        2.40632      0.49528      23.6054 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.68615      0.9988      45.3724 (1)
CLAS2   2        0.65212      0.9992      37.7357 (1)
CLAS3   3        0.72700      0.9909      57.1707 (1)
CLAS4   4        0.91691      0.?633     269.2067 (1)
WORDS   6        0.37663      0.9072       8.4304 (2)
SYMBL   7        0.16829      0.9444       1.4864 (2)
LOGCN   8       -0.09201      0.9006       0.4354 (2)
PAREN   9        0.06370      1.0000       0.2076 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
RE     11        0.46809      0.9579      14.3100 (2)
CP     12       -0.06387      0.9117       0.2089 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.29555      0.8793       4.8812 (2)
STEPS  16        0.39348      0.9162       9.342? (2)
R INF  17        0.23316      0.9138       2.9320 (2)
AV TH  18        0.14183      0.4587       1.0469 (2)
AV AX  19        0.18814      0.7217       1.8715 (2)
TOT R  20        0.21967      0.5311       2.5858 (2)
PSLI   21        0.13525      0.8525       0.9503 (2)
POSIT  22        0.24041      0.5554       3.1286 (2)

TABLE 6.7B

STEP NUMBER 2 FOR C5

VARIABLE ENTERED       11
MULTIPLE R             0.6804
STD. ERROR OF EST.     2.2187

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     2          216.380       108.190    21.979
RESIDUAL      51          251.046         4.922

VARIABLES IN EQUATION: (CONSTANT= 2.19584 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        1.00026      0.26442      14.3100 (2)
THERM  15        2.75689      0.45155      37.2760 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.57328      0.4370      24.4764 (1)
CLAS2   2        0.53388      0.3142      19.9325 (1)
CLAS3   3        0.66357      0.3602      39.3372 (1)
CLAS4   4        0.89915      0.5559     211.0452 (1)
WORDS   6        0.26095      0.8013       3.6534 (2)
SYMBL   7        0.07789      0.9004       0.3052 (2)
LOGCN   8       -0.13868      0.8968       0.9804 (2)
PAREN   9       -0.05040      0.9477       0.1273 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12       -0.09181      0.9104       0.4250 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
AXIOM  14        0.39105      0.8700       9.0265 (2)
STEPS  16       -0.11690      0.1329       0.6927 (2)
R INF  17        0.24086      0.9121       3.0792 (2)
AV TH  18        0.12946      0.4571       0.8523 (2)
AV AX  19        0.25120      0.7180       3.3676 (2)
TOT R  20        0.24909      0.5311       3.3074 (2)
PSLI   21        0.13477      0.8515       0.9249 (2)
POSIT  22        0.27371      0.5554       4.0491 (2)

TABLE 6.7C

STEP NUMBER 3 FOR C5

VARIABLE ENTERED       14
MULTIPLE R             0.7383
STD. ERROR OF EST.     2.0623

ANALYSIS OF VARIANCE:

              DF   SUM OF SQUARES   MEAN SQUARE   F-RATIO
REGRESSION     3          254.770        84.923    19.967
RESIDUAL      50          212.656         4.253

VARIABLES IN EQUATION: (CONSTANT= 1.16628 )

VARIABLE     COEFFICIENT   STD. ERROR   F TO REMOVE
RE     11        1.07668      0.24710      18.9861 (2)
AXIOM  14        1.31979      0.43928       9.0265 (2)
THERM  15        3.24332      0.44987      51.9762 (2)

VARIABLES NOT IN EQUATION:

VARIABLE   PARTIAL CORR.   TOLERANCE   F TO ENTER
CLAS1   1        0.51253      0.3907      17.4574 (1)
CLAS2   2        0.50526      0.3016      16.7969 (1)
CLAS3   3        0.63034      0.3345      32.3053 (1)
CLAS4   4        0.88348      0.4851     174.2755 (1)
WORDS   6        0.24994      0.7960       3.2649 (2)
SYMBL   7        0.08524      0.9004       0.3586 (2)
LOGCN   8       -0.00040      0.7846       0.0000 (2)
PAREN   9       -0.07841      0.9448       0.3031 (2)
PREMS  10        0.00000      1.0000       0.0000 (2)
CP     12        0.03457      0.8218       0.0586 (2)
AV RE  13        0.00000      1.0000       0.0000 (2)
STEPS  16       -0.09751      0.1322       0.4703 (2)
R INF  17        0.12979      0.8144       0.8396 (2)
AV TH  18        0.02529      0.4229       0.0314 (2)
AV AX  19        0.05548      0.5150       0.1513 (2)
TOT R  20        0.07122      0.4031       0.2498 (2)
PSLI   21        0.27094      0.7897       3.8821 (2)
POSIT  22        0.11727      0.4409       0.6833 (2)

TABLE 6.7D

SUMMARY TABLE FOR C5:

STEP   VARIABLE     MULTIPLE             INCREASE   F VALUE    LAST REG
NUM    ENT REM      R         RSQ        IN RSQ     FOR DEL    COEFFICNTS
  1    THERM  15    0.55880   0.31226    0.31226    23.6054     3.93706
  2    RE     11    0.68040   0.46294    0.15069    14.3100     2.14152
  3    AXIOM  14    0.73830   0.54509    0.08214     9.0265     1.37732
  4    PSLI   21    0.76060   0.57851    0.03343     3.8821     0.33648
  5    STEPS  16    0.77310   0.59768    0.01917     2.2905    -0.52878
  6    WORDS   6    0.78660   0.61874    0.02106     2.5907     0.10568
  7    PAREN   9    0.79740   0.63585    0.01711     2.1760    -0.87813
  8    R INF  17    0.80470   0.64754    0.01170     1.4924     1.69602
  9    SYMBL   7    0.81400   0.66260    0.01505     1.9577     0.10182
 10    CP     12    0.81630   0.66635    0.00375     0.4876     1.98407
 11    LOGCN   8    0.81890   0.67060    0.00425     0.5385    -1.60871
 12    POSIT  22    0.82090   0.67388    0.00328     0.4165    -0.18563
 13    TOT R  20    0.82540   0.68129    0.00741     0.9287     0.86059
 14    AV TH  18    0.82560   0.68162    0.00033     0.0415     0.11391

FIGURE 6.6A - RESIDUALS FOR C5 ON RESTRICTED SET

RANGE     0         10        20        30        40        50
          I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I-I
          I
          I
          I
          I
          I
          I
          I
          I
          I

          I*
-4.001
-4.000-
-3.001    I*
-3.000-   I***
-2.000-   I***********
-1.001    I***********
-1.000-
-.001     I*****************
.999      I*******
1.000-    I*******
1.999     I*******
6.999
7.000-

FIGURE 6.6B - RESIDUALS (Y-AXIS) VS COMPUTED C5 (X-AXIS)

CHAPTER SEVEN



A subset of the set of proofs for problem 414035 was
presented in Chapter III to illustrate the classification
procedure. In this chapter, the full set of proofs for
three problems will be presented, to provide additional
insight into the nature of the differences in the sample as
a whole. The first problem discussed is drawn from the
early part of the curriculum, before the introduction of RE,
and exhibits very little variation under any of the five
partitions. The second problem comes after the introduction
of RE, but before the introduction of the first theorem.
It shows considerable variation under the first two
partitions but very little under the last three. The last
problem occurs when four theorems are available and shows
considerable variation under all five partitions.

As in Chapter III, paradigm proofs identify the
different classes under each of the first three partitions. 
All of the proofs in a class are equivalent to the paradigm 
proof, up to the differences allowed under the partition 
being discussed. For the last two partitions, individual 
classes are identified by the distribution which defines 
the class. All of the classes under a partition are 
referred to by letters of the alphabet; the numbers that 
appear after these letters are the number of student proofs 
included in the class. 

7.1 - PROBLEM 407010

Problem 407010 is fairly typical of those occurring
before the introduction of RE. The statement of this
problem is:

407010:

DERIVE 6+A = (5+1)+A

There are three classes under the first partition (Table
7.1) and only one class under each subsequent partition.

The first partition is defined by the identity
relation; even under this strict definition of equivalence,
there are only three classes of proofs in the sample of
twenty-three. For the second partition, there is just one
class. All twenty-three proofs are equivalent up to
differences in unused steps. There is little variation in
the set of proofs for this problem, and the variations
which do occur are relatively superficial.

7.2 PROBLEM 411051 

The statement of this problem is: 



411051: 

DERIVE 10 = A -> A-7 = 10-(6+1) 



Problem 411051 occurs when RE is available. The first 
partition for this problem contains eleven classes of 
proofs. The paradigm proofs for these eleven classes are 
listed in Table 7.2A. 

All of the proofs in Table 7.2A use the same six 
rules: WP, SE, ND, CE, RE, and CP; these rules are used in 
a consistent way from proof to proof. In each case, WP is 
used to generate the formula, 10 = A, and ND is used to 
generate 7 = 6+1. CE and SE are used to modify 10 = A, and 
RE is used to combine the formula derived from 10 = A and 
the formula, 7 = 6+1. Finally, CP is used to generate the 
required conditional, 10 = A -> A-7 = 10-(6+1). The proofs 
differ in the order in which these rules are used and in 
the presence, in some proofs, of unused steps. For 
example, the only difference between proofs A and D is 
the position of the step employing ND. Proofs B and D are 
the same, except for the two unused steps, (5) and (6), in 
proof B. 

Under the second partition, there are seven classes of 
proofs. The paradigm proofs are found in Table 7.2B. The 
criteria which define the second partition ignore unused 
steps; therefore, no unused steps appear in the paradigm 
proofs for this partition. All of these proofs contain six 
steps, using the same six rules. The order in which these 
rules are used, however, changes from one proof to another. 

The three paradigm proofs for the third partition are 
contained in Table 7.2C. Again, the only differences 
between these proofs are in the order of rule use. Under 
the second partition any difference in order results in a 
separate classification; the third partition is sensitive 
to some differences in order but not to all. In this 
problem, 411051, the third partition ignores the position 
of the step using ND, but does not ignore changes in the 
order in which CE, SE, and RE are introduced. These three 
rules are used, in some order, to successively modify 
10 = A; various sequences of these three steps constitute 
the core of the proofs. The ND-step is introduced only to be 
used with RE and can occur anywhere before RE. Most of the 
variation observed for the first two partitions is likewise 
due to the differences in the position of the ND-step. 
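
The behavior of the third partition on this problem can be 
imitated, as an illustration only, by comparing proofs 
through their pattern of line references instead of their 
step order. The sketch below is not the definition given in 
Chapter III. It assumes that each step is recorded as a pair 
(line references, rule), with rule arguments and occurrence 
digits dropped, and that the last line is the conclusion; the 
names tree and order_free_form are hypothetical. 

```python
def tree(proof, n):
    """Nested tuple for line n: the rule, then the tree of each referenced line."""
    refs, rule = proof[n - 1]
    return (rule,) + tuple(tree(proof, r) for r in refs)


def order_free_form(proof):
    """Form of a proof that ignores where independent steps were placed."""
    return tree(proof, len(proof))


# (line references, rule) per step, read from Tables 7.2A and 7.2B.
proof_a = [((), "WP"), ((), "ND"), ((1,), "CE"), ((3,), "SE"),
           ((4, 2), "RE"), ((1, 5), "CP")]
proof_d = [((), "WP"), ((1,), "CE"), ((2,), "SE"), ((), "ND"),
           ((3, 4), "RE"), ((1, 5), "CP")]
proof_c = [((), "WP"), ((), "ND"), ((1,), "SE"), ((3,), "CE"),
           ((4, 2), "RE"), ((1, 5), "CP")]
proof_g = [((), "WP"), ((1,), "SE"), ((), "ND"), ((2, 3), "RE"),
           ((4,), "CE"), ((1, 5), "CP")]

a_form, d_form = order_free_form(proof_a), order_free_form(proof_d)
c_form, g_form = order_free_form(proof_c), order_free_form(proof_g)
```

Under these assumptions proofs A and D coincide, because 
only the position of the ND-step differs, while C and G each 
differ from A and from each other. This agrees with the 
three classes listed in Table 7.2C. 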

All of the proofs in the sample for this problem are 
in the same class under the fourth and fifth partitions, 
since they all use the same six rules and use each of them 
only once. 

7.3 PROBLEM 415044 

The final example to be discussed is problem 415044. 
The statement for this problem is: 



415044: HERE IS THEOREM 5 
DERIVE: 0 = -0 



Under the first partition there are sixteen classes in the 
sample of twenty three proofs; a list of the paradigm 
proofs for each of these classes is presented in Table 
7.3A. 

The proof labeled D, in Table 7.3A, is the standard 
proof for this problem; it uses two theorems, TH3 and TH4. 
Six students constructed the standard proof; this is the 
largest number of proofs in any of the sixteen classes. C 
is the class with the second largest number of student 
proofs. 

The differences between C and D are worth discussing 
in detail. The first step in C is identical to the first 
step in D. The second step in proof D uses TH3 to generate 
the formula, 0-0 = 0. Proof C uses three steps to generate 
the same formula; these three steps are a special case of 
the proof of TH3. Both C and D then use RE to complete the 
proof. 

The students who constructed proof C recognized that 
they needed the formula, 0-0 = 0, but did not realize that 
this formula could be generated in one step by using TH3. 
So they proved this instance of TH3, using the axioms AI 
and N and the rule, RE, which form the standard proof of 
TH3. A slightly different version of this proof for the 
necessary instance of TH3 is found in proof J, while proof 
E uses TH3 and includes a derivation of the needed instance 
of TH4. Since every theorem in the curriculum may be 
proved using the axioms and rules of inference, it is never 
necessary to use a theorem; any instance of a theorem can 
be proved using the axioms and rules of inference. 

Proofs F, O, and P use no theorems (the single 
occurrence of TH4 in proof P is in an unused step). Proof 
K, on the other hand, uses TH1 and TH2 with CA and RE. 

In addition to this basic variation in the rules used, 
there are differences in the order in which the rules are 
used and in the presence of unused steps. Proof H, for 
example, is the same as proof D except for its unused 
second step. 
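
One way to picture the treatment of unused steps is to trace 
line references back from the final line and discard every 
line that is never reached. The sketch below is an 
assumption about how "unused" might be computed, not the 
procedure of Chapter III. Each step is a pair (line 
references, rule); the function name strip_unused is 
hypothetical, and the references (1, 3) for the last step of 
proof H are inferred, since that cell of Table 7.3A is only 
partly legible. 

```python
def strip_unused(proof):
    """Drop lines the final line does not depend on, then renumber."""
    used, stack = set(), [len(proof)]
    while stack:
        n = stack.pop()
        if n not in used:
            used.add(n)
            stack.extend(proof[n - 1][0])  # follow this line's references
    kept = sorted(used)
    new_number = {old: new for new, old in enumerate(kept, start=1)}
    return [
        (tuple(new_number[r] for r in proof[old - 1][0]), proof[old - 1][1])
        for old in kept
    ]


# Proofs D and H for problem 415044 (Table 7.3A).
proof_d = [((), "TH4[0]"), ((), "TH3[0]"), ((1, 2), "RE1")]
proof_h = [((), "TH4[0]"), ((), "TH1[0]"), ((), "TH3[0]"), ((1, 3), "RE1")]

h_stripped = strip_unused(proof_h)
d_stripped = strip_unused(proof_d)
```

With the unused TH1 step removed and the lines renumbered, 
proof H becomes identical to proof D, which is why the two 
fall into one class once unused steps are ignored. 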

The paradigm proofs for the second partition are 
listed in Table 7.3B. Unused lines are ignored under the 
second partition, so the number of classes decreases from 
sixteen to fourteen. The paradigm proofs for the third 
partition form Table 7.3C. Here some variation in the 
order of steps is allowed, and the number of classes is 
reduced to twelve. 

Moving from the third partition to the fourth, two of 
these twelve classes are combined, leaving a total of 
eleven classes (Table 7.3D). None of these merge under the 
fifth partition, which also contains eleven classes 
(Table 7.3E). For this last problem, then, the decrease in 
the number of classes from one partition to the next is 
very gradual. The reason for this was indicated in the 
discussion of the first partition. The proofs differ 
principally in the set of rules employed. Since all five 
partitions are sensitive to such differences, this 
component of variation does not disappear for the later 
partitions. 
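
The grouping performed by the last two partitions can be 
sketched as follows. This is an illustration under stated 
assumptions, not the program used in the study: rule labels 
are reduced to a base name by dropping bracketed arguments 
and trailing occurrence digits (theorem names such as TH3 
are kept whole), the fourth partition is taken to compare 
how often each rule is used, and the fifth to compare only 
which rules are used. The function names are hypothetical; 
the step lists are five of the paradigm proofs in Table 7.3B. 

```python
import re
from collections import Counter, defaultdict


def rule_name(step):
    """'TH3[0]' -> 'TH3', 'RE2' -> 'RE', 'Z[-0]' -> 'Z' (assumed reading)."""
    name = step.split("[")[0]
    return name if name.startswith("TH") else re.sub(r"\d+$", "", name)


def rule_counts(steps):
    """Fourth-partition key: how many times each rule is used."""
    return frozenset(Counter(map(rule_name, steps)).items())


def rule_set(steps):
    """Fifth-partition key: which rules are used at all."""
    return frozenset(map(rule_name, steps))


def classes(proofs, key):
    """Group proof labels whose keys are equal."""
    groups = defaultdict(list)
    for label, steps in proofs.items():
        groups[key(steps)].append(label)
    return sorted(groups.values())


paradigms = {
    "D": ["TH4[0]", "TH3[0]", "RE1"],
    "C": ["TH4[0]", "AI[0]", "N[0,0]", "RE1", "RE1"],
    "F": ["LT[0]", "AI[0]", "CE1", "RE2", "Z[-0]", "CA1", "RE1"],
    "K": ["TH1[-0]", "CA1", "TH2[0]", "RE1"],
    "O": ["LT[-0]", "Z[-0]", "CE1", "RE1", "AI[0]", "CA1", "RE1"],
}

fourth = classes(paradigms, rule_counts)
fifth = classes(paradigms, rule_set)
```

For these five proofs, F and O fall into one class under 
both keys and the others remain separate, which matches the 
single merger reported between the third and fourth 
partitions. 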

Two additional types of variation appear under the 
first partition: the presence of unused steps in some 
proofs and the variations in the order of steps. These 
types of variation become irrelevant for the later 
partitions and disappear. In the previous examples 
discussed here, differences in the order of steps accounted 
for most of the differences observed, so the number of 
classes decreased rapidly from the first partition to the 
fifth. 
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TABLE 7.1 



FIRST PARTITION FOR PROBLEM 407010 





 .   .   .ND6        (1)   6 = 5+1                       A 
 .   .   .AE[A]      (2)   6+A = (5+1)+A 

 .   .   .ND6        (1)   6 = 5+1                       B 
 1.  .   .AE[A]      (2)   6+A = (5+1)+A 

 .   .   .ND5        (1)   5 = 4+1                       C 
 .   .   .DLL 
 .   .   .ND6        (1)   6 = 5+1 
 1.  .   .AE[A]      (2)   6+A = (5+1)+A 



* The numbers in parentheses to the right of 
each proof are the number of proofs in the 
class. 
TABLE 7.2A 



FIRST PARTITION FOR PROBLEM 411051 



 .   .   .WP[10=A]   (1)   10 = A                        A (4) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .CE1        (3)   A = 10 
 3.  .   .SE[7]      (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        B (1) 
 1.  .   .CE1        (2)   A = 10 
 2.  .   .SE[7]      (3)   A-7 = 10-7 
 .   .   .ND7        (4)   7 = 6+1 
         .CE1        (5)   6+1 = 7 
 3.  4.  .RE1        (6)   A-(6+1) = 10-7 
 3.  4.  .RE2        (7)   A-7 = 10-(6+1) 
 1.  7.  .CP         (8)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        C (1) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .SE[7]      (3)   10-7 = A-7 
 3.  2.  .RE2        (4)   10-7 = A-(6+1) 
 .   .   .DLL 
 3.  .   .CE1        (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        D (9) 
 1.  .   .CE1        (2)   A = 10 
 2.  .   .SE[7]      (3)   A-7 = 10-7 
 .   .   .ND7        (4)   7 = 6+1 
 3.  4.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        E (1) 
 .   .   .CE1        (2)   A = 10 
 .   .   .ND7        (3)   7 = 6+1 
 2.  .   .SE[7]      (4)   A-7 = 10-7 
 4.  3.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        F (1) 
 1.  .   .CE1        (2)   A = 10 
 .   .   .ND7        (3)   7 = 6+1 
 3.  .   .CE1        (4)   6+1 = 7 
 2.  .   .SE[7]      (5)   A-7 = 10-7 
 5.  3.  .RE2        (6)   A-7 = 10-(6+1) 
 1.  6.  .CP         (7)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        G (2) 
 1.  .   .SE[7]      (2)   10-7 = A-7 
 .   .   .ND7        (3)   7 = 6+1 
 2.  3.  .RE1        (4)   10-(6+1) = A-7 
 4.  .   .CE1        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        H (1) 
 .   .   .CE1        (2)   A = 10 
 2.  .   .SE[7]      (3)   A-7 = 10-7 
 .   .   .ND7        (4)   7 = 6+1 
 3.  4.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        I (1) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .SE[7]      (3)   10-7 = A-7 
 3.  2.  .RE1        (4)   10-(6+1) = A-7 
 4.  .   .CE1        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        J (1) 
 1.  .   .SE[7]      (2)   10-7 = A-7 
 .   .   .ND7        (3)   7 = 6+1 
 2.  3.  .RE2        (4)   10-7 = A-(6+1) 
 1.  4.  .CP         (5)   10 = A -> 10-7 = A-(6+1) 
 .   .   .DLL 
 .   .   .DLL 
 2.  .   .CE1        (4)   A-7 = 10-7 
 4.  3.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        K (1) 
 1.  .   .CE1        (2)   A = 10 
 .   .   .ND7        (3)   7 = 6+1 
 .   .   .DLL 
 .   .   .SE[7]      (3)   A-7 = 10-7 
 .   .   .ND7        (4)   7 = 6+1 
 3.  4.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 



TABLE 7.2B 



SECOND PARTITION FOR PROBLEM 411051 



 .   .   .WP[10=A]   (1)   10 = A                        A (4) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .CE1        (3)   A = 10 
 3.  .   .SE[7]      (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        B,D,H,K (12) 
 1.  .   .CE1        (2)   A = 10 
 2.  .   .SE[7]      (3)   A-7 = 10-7 
 .   .   .ND7        (4)   7 = 6+1 
 3.  4.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        C (1) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .SE[7]      (3)   10-7 = A-7 
 3.  .   .CE1        (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        E,F (2) 
 .   .   .CE1        (2)   A = 10 
 .   .   .ND7        (3)   7 = 6+1 
 2.  .   .SE[7]      (4)   A-7 = 10-7 
 4.  3.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        G (2) 
 1.  .   .SE[7]      (2)   10-7 = A-7 
 .   .   .ND7        (3)   7 = 6+1 
 2.  3.  .RE1        (4)   10-(6+1) = A-7 
 4.  .   .CE1        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        I (1) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .SE[7]      (3)   10-7 = A-7 
 3.  2.  .RE1        (4)   10-(6+1) = A-7 
 4.  .   .CE1        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        J (1) 
 1.  .   .SE[7]      (2)   10-7 = A-7 
 .   .   .ND7        (3)   7 = 6+1 
 2.  .   .CE1        (4)   A-7 = 10-7 
 4.  3.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 
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TABLE 7.2C 



THIRD PARTITION FOR PROBLEM 411051 



 .   .   .WP[10=A]   (1)   10 = A                        A,B,D,H,K,E,F (18) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .CE1        (3)   A = 10 
 3.  .   .SE[7]      (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        C,J (2) 
 .   .   .ND7        (2)   7 = 6+1 
 1.  .   .SE[7]      (3)   10-7 = A-7 
 3.  .   .CE1        (4)   A-7 = 10-7 
 4.  2.  .RE2        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 

 .   .   .WP[10=A]   (1)   10 = A                        G,I (3) 
 1.  .   .SE[7]      (2)   10-7 = A-7 
 .   .   .ND7        (3)   7 = 6+1 
 2.  3.  .RE1        (4)   10-(6+1) = A-7 
 4.  .   .CE1        (5)   A-7 = 10-(6+1) 
 1.  5.  .CP         (6)   10 = A -> A-7 = 10-(6+1) 



TABLE 7.3A 



FIRST PARTITION FOR PROBLEM 415044 



 .   .   .N[A,A]     (1)   A+(-A) = A-A                  A (1) 
 .   .   .AI[A]      (2)   A+(-A) = 0 
 1.  2.  .RE1        (3)   0 = A-A 
 .   .   .TH1[0]     (4)   0+0 = 0 
 .   .   .AI[0]      (5)   0+(-0) = 0 
 5.  .   .CE1        (6)   0 = 0+(-0) 
 4.      .RE3        (7)   0+0 = 0+(-0) 
 7.  .   .SE[0]      (8)   (0+0)-0 = (0+(-0))-0 
 .   .   .TH3[0]     (9)   0-0 = 0 
 8.  4.  .RE1        (10)  0-0 = (0+(-0))-0 
 10. 9.  .RE1        (11)  0 = (0+(-0))-0 
 11. .   .CA1        (12)  0 = ((-0)+0)-0 
 .   .   .TH4[0]     (13)  0-0 = -0 
 9.  .   .CE1        (14)  0 = 0-0 
 13. 9.  .RE1        (15)  0 = -0 

 .   .   .TH3[0]     (1)   0-0 = 0                       B (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 1.  2.  .RE1        (3)   -0 = 0 
 3.  .   .CE1        (4)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      C (3) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = 0 
 1.  4.  .RE1        (5)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      D (6) 
 .   .   .TH3[0]     (2)   0-0 = 0 
 1.  2.  .RE1        (3)   0 = -0 

 .   .   .Z[-0]      (1)   (-0)+0 = -0                   E (1) 
 .   .   .CA1        (2)   0+(-0) = -0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = -0 
 .   .   .TH3[0]     (5)   0-0 = 0 
 4.  5.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         F (1) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 2.  .   .CE1        (3)   0 = 0+(-0) 
 1.  3.  .RE2        (4)   0 = 0+(-0) 
 .   .   .Z[-0]      (5)   (-0)+0 = -0 
 5.  .   .CA1        (6)   0+(-0) = -0 
 4.  6.  .RE1        (7)   0 = -0 

 .   .   .TH1[0]     (1)   0+0 = 0                       G (1) 
 .   .   .TH2[0]     (2)   (-0)+0 = 0 
 .   .   .TH3[0]     (3)   0-0 = 0 
 .   .   .TH4[0]     (4)   0-0 = -0 
 3.  .   .CE1        (5)   0 = 0-0 
 1.  5.  .RE3        (6)   0+0 = 0-0 
 6.  4.  .RE1        (7)   0+0 = -0 
 7.  1.  .RE1        (8)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      H (1) 
 .   .   .TH1[0]     (2)   0+0 = 0 
 .   .   .TH3[0]     (3)   0-0 = 0 
 1.      .RE1        (4)   0 = -0 

 .   .   .TH3[0]     (1)   0-0 = 0                       I (1) 
 .   .   .DLL 
 .   .   .TH4[0]     (1)   0-0 = -0 
 .   .   .TH3[0]     (2)   0-0 = 0 
 1.      .RE1        (3)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      J (1) 
 .   .   .N[0,0]     (2)   0+(-0) = 0-0 
 .   .   .AI[0]      (3)   0+(-0) = 0 
 3.  2.  .RE1        (4)   0-0 = 0 
 1.  4.  .RE1        (5)   0 = -0 

 .   .   .TH1[-0]    (1)   0+(-0) = -0                   K (1) 
 1.  .   .CA1        (2)   (-0)+0 = -0 
 .   .   .TH2[0]     (3)   (-0)+0 = 0 
 2.  3.  .RE1        (4)   0 = -0 

 .   .   .N[0,0]     (1)   0+(-0) = 0-0                  L (1) 
 .   .   .Z[0]       (2)   0+0 = 0 
 .   .   .Z[-0]      (3)   (-0)+0 = -0 
         .CA1        (4)   0+(-0) = -0 
         .RE1        (5)   -0 = 0-0 
 .   .   .TH3[0]     (6)   0-0 = 0 
         .RE1        (7)   -0 = 0 
         .CE1        (8)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         M (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 .   .   .TH3[0]     (3)   0-0 = 0 
 1.  .   .SE[0]      (4)   0-0 = 0-0 
 4.  3.  .RE1        (5)   0 = 0-0 
 5.  2.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         N (1) 
         .AE[0]      (2)   0+0 = 0+0 
 .   .   .DLL 
         .AE[-0]     (2)   0+(-0) = 0+(-0) 
 .   .   .TH4[0]     (3)   0-0 = -0 
 .   .   .TH2[0]     (4)   (-0)+0 = 0 
 4.  .   .CA1        (5)   0+(-0) = 0 
 2.  5.  .RE1        (6)   0 = 0+(-0) 
 .   .   .Z[0]       (7)   0+0 = 0 
 .   .   .DLL 
 .   .   .Z[-0]      (7)   (-0)+0 = -0 
 7.  .   .CA1        (8)   0+(-0) = -0 
         .RE1        (9)   0 = -0 

 .   .   .LT[-0]     (1)   -0 = -0                       O (1) 
 .   .   .N[0,0]     (2)   0+(-0) = 0-0 
 .   .   .Z[-0]      (3)   (-0)+0 = -0 
 3.  .   .CE1        (4)   -0 = (-0)+0 
 1.  4.  .RE1        (5)   (-0)+0 = -0 
 .   .   .AI[0]      (6)   0+(-0) = 0 
 6.  .   .CA1        (7)   (-0)+0 = 0 
 5.  7.  .RE1        (8)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      P (1) 
 .   .   .Z[-0]      (2)   (-0)+0 = -0 
 2.  .   .CA1        (3)   0+(-0) = -0 
 .   .   .AI[0]      (4)   0+(-0) = 0 
 3.  4.  .RE1        (5)   0 = -0 
TABLE 7.3B 



SECOND PARTITION FOR PROBLEM 415044 



 .   .   .TH3[0]     (1)   0-0 = 0                       A (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 2.  1.  .RE1        (3)   0 = -0 

 .   .   .TH3[0]     (1)   0-0 = 0                       B (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 1.  2.  .RE1        (3)   -0 = 0 
 3.  .   .CE1        (4)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      C (3) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = 0 
 1.  4.  .RE1        (5)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      D,H,I (8) 
 .   .   .TH3[0]     (2)   0-0 = 0 
 1.  2.  .RE1        (3)   0 = -0 

 .   .   .Z[-0]      (1)   (-0)+0 = -0                   E (1) 
 .   .   .CA1        (2)   0+(-0) = -0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = -0 
 .   .   .TH3[0]     (5)   0-0 = 0 
 4.  5.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         F (1) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 2.  .   .CE1        (3)   0 = 0+(-0) 
 1.  3.  .RE2        (4)   0 = 0+(-0) 
 .   .   .Z[-0]      (5)   (-0)+0 = -0 
 5.  .   .CA1        (6)   0+(-0) = -0 
 4.  6.  .RE1        (7)   0 = -0 

 .   .   .TH1[0]     (1)   0+0 = 0                       G (1) 
 .   .   .TH3[0]     (2)   0-0 = 0 
 .   .   .TH4[0]     (3)   0-0 = -0 
 2.  .   .CE1        (4)   0 = 0-0 
 1.  4.  .RE3        (5)   0+0 = 0-0 
 5.  3.  .RE1        (6)   0+0 = -0 
 6.  1.  .RE1        (7)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      J (1) 
 .   .   .N[0,0]     (2)   0+(-0) = 0-0 
 .   .   .AI[0]      (3)   0+(-0) = 0 
 3.  2.  .RE1        (4)   0-0 = 0 
 1.  4.  .RE1        (5)   0 = -0 

 .   .   .TH1[-0]    (1)   0+(-0) = -0                   K (1) 
 1.  .   .CA1        (2)   (-0)+0 = -0 
 .   .   .TH2[0]     (3)   (-0)+0 = 0 
 2.  3.  .RE1        (4)   0 = -0 

 .   .   .N[0,0]     (1)   0+(-0) = 0-0                  L (1) 
 .   .   .Z[-0]      (2)   (-0)+0 = -0 
 2.  .   .CA1        (3)   0+(-0) = -0 
 1.  3.  .RE1        (4)   -0 = 0-0 
 .   .   .TH3[0]     (5)   0-0 = 0 
 4.  5.  .RE1        (6)   -0 = 0 
 6.  .   .CE1        (7)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         M (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 .   .   .TH3[0]     (3)   0-0 = 0 
 1.  .   .SE[0]      (4)   0-0 = 0-0 
 4.  3.  .RE1        (5)   0 = 0-0 
 5.  2.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         N (1) 
         .AE[-0]     (2)   0+(-0) = 0+(-0) 
 .   .   .TH2[0]     (3)   (-0)+0 = 0 
         .CA1        (4)   0+(-0) = 0 
         .RE1        (5)   0 = 0+(-0) 
 .   .   .Z[-0]      (6)   (-0)+0 = -0 
         .CA1        (7)   0+(-0) = -0 
         .RE1        (8)   0 = -0 

 .   .   .LT[-0]     (1)   -0 = -0                       O (1) 
 .   .   .Z[-0]      (2)   (-0)+0 = -0 
         .CE1        (3)   -0 = (-0)+0 
         .RE1        (4)   (-0)+0 = -0 
 .   .   .AI[0]      (5)   0+(-0) = 0 
 5.  .   .CA1        (6)   (-0)+0 = 0 
 4.  6.  .RE1        (7)   0 = -0 

 .   .   .Z[-0]      (1)   (-0)+0 = -0                   P (1) 
 1.  .   .CA1        (2)   0+(-0) = -0 
 .   .   .AI[0]      (3)   0+(-0) = 0 
 2.  3.  .RE1        (4)   0 = -0 
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TABLE 7.3C 



THIRD PARTITION FOR PROBLEM 415044 



 .   .   .TH3[0]     (1)   0-0 = 0                       A,D,H,I (9) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 2.  1.  .RE1        (3)   0 = -0 

 .   .   .TH3[0]     (1)   0-0 = 0                       B (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 1.  2.  .RE1        (3)   -0 = 0 
 3.  .   .CE1        (4)   0 = -0 

 .   .   .TH4[0]     (1)   0-0 = -0                      C,J (4) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = 0 
 1.  4.  .RE1        (5)   0 = -0 

 .   .   .Z[-0]      (1)   (-0)+0 = -0                   E (1) 
 .   .   .CA1        (2)   0+(-0) = -0 
 .   .   .N[0,0]     (3)   0+(-0) = 0-0 
 2.  3.  .RE1        (4)   0-0 = -0 
 .   .   .TH3[0]     (5)   0-0 = 0 
 4.  5.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         F (1) 
 .   .   .AI[0]      (2)   0+(-0) = 0 
 2.  .   .CE1        (3)   0 = 0+(-0) 
 1.  3.  .RE2        (4)   0 = 0+(-0) 
 .   .   .Z[-0]      (5)   (-0)+0 = -0 
 5.  .   .CA1        (6)   0+(-0) = -0 
 4.  6.  .RE1        (7)   0 = -0 

 .   .   .TH1[0]     (1)   0+0 = 0                       G (1) 
 .   .   .TH3[0]     (2)   0-0 = 0 
 .   .   .TH4[0]     (3)   0-0 = -0 
 2.  .   .CE1        (4)   0 = 0-0 
 1.  4.  .RE3        (5)   0+0 = 0-0 
 5.  3.  .RE1        (6)   0+0 = -0 
 6.  1.  .RE1        (7)   0 = -0 

 .   .   .TH1[-0]    (1)   0+(-0) = -0                   K (1) 
 1.  .   .CA1        (2)   (-0)+0 = -0 
 .   .   .TH2[0]     (3)   (-0)+0 = 0 
 2.  3.  .RE1        (4)   0 = -0 

 .   .   .N[0,0]     (1)   0+(-0) = 0-0                  L (1) 
 .   .   .Z[-0]      (2)   (-0)+0 = -0 
 2.  .   .CA1        (3)   0+(-0) = -0 
 1.  3.  .RE1        (4)   -0 = 0-0 
 .   .   .TH3[0]     (5)   0-0 = 0 
 4.  5.  .RE1        (6)   -0 = 0 
 6.  .   .CE1        (7)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         M (1) 
 .   .   .TH4[0]     (2)   0-0 = -0 
 .   .   .TH3[0]     (3)   0-0 = 0 
 1.  .   .SE[0]      (4)   0-0 = 0-0 
 4.  3.  .RE1        (5)   0 = 0-0 
 5.  2.  .RE1        (6)   0 = -0 

 .   .   .LT[0]      (1)   0 = 0                         N (1) 
         .AE[-0]     (2)   0+(-0) = 0+(-0) 
 .   .   .TH2[0]     (3)   (-0)+0 = 0 
 3.  .   .CA1        (4)   0+(-0) = 0 
 2.  4.  .RE1        (5)   0 = 0+(-0) 
 .   .   .Z[-0]      (6)   (-0)+0 = -0 
 5.  .   .CA1        (7)   0+(-0) = -0 
 4.  6.  .RE1        (8)   0 = -0 
TABLE 7.3D 
FOURTH PARTITION FOR PROBLEM 415044 



  Z    N    LT   AE   SE   CE   RE   CA   AI   TH1  TH2  TH3  TH4 

  0    0    0    0    0    0    1    0    0    0    0    1    1     A,D,H,I (9) 
  0    0    0    0    0    1    1    0    0    0    0    1    1     B (1) 
  0    1    0    0    0    0    2    0    1    0    0    0    1     C,J (4) 
  1    1    0    0    0    0    2    1    0    0    0    1    0     E (1) 
  1    0    1    0    0    1    2    1    1    0    0    0    0     F,O (2) 
  0    0    0    0    0    1    3    0    0    1    0    1    1     G (1) 
  0    0    0    0    0    0    1    1    0    1    1    0    0     K (1) 
  1    1    0    0    0    1    2    1    0    0    0    1    0     L (1) 
  0    0    1    0    1    0    2    0    0    0    0    1    1     M (1) 
  1    0    1    1    0    0    2    2    0    0    1    0    0     N (1) 
  1    0    0    0    0    0    1    1    1    0    0    0    0     P (1) 
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CHAPTER EIGHT 



8.1 - INTRODUCTION 

The study discussed in this dissertation was 
essentially exploratory. The initial purpose was to 
evaluate the LIS curriculum along one of its dimensions, 
variability in student proofs. In order to do this, a 
classification procedure was developed and used to measure 
variability in a set of student proofs. 

The classification procedures described in Chapter III 
allow us to compare student proofs at five levels of 
detail. These techniques have proven adequate for this 
study, and should be useful in a wide range of related 
studies. 

The classification procedure was also used to 
investigate the relationship between the variability 
(number of classes of equivalent proofs) in a sample of 
proofs for a problem and the characteristics of the 
problem. The results for this part of the study provided 
increased understanding of both the sources of variation 
within the curriculum and the properties of the 
classification procedure. 

8.2 VARIABILITY OF PROOF BEHAVIOR IN THE CURRICULUM 

The derivation problems in the algebra part of the 
Stanford Logic-Instructional System (LIS) curriculum have 
been used in this study. The measured variability within 
this set of problems is high for all five partitions, and 
increases from one lesson to the next. 

Even for the fifth partition, which requires that two 
proofs use different sets of rules if they are to be put 
into distinct classes, there is a substantial amount of 
variation in the final lessons considered. Under the first 
partition, identity of the proofs (except for error steps) 
is required; under this criterion there are a large number 
of proof classes for almost all of the problems studied. 

LIS will accept any valid proof for a problem. It 
checks the validity of each step rather than comparing the 
student's proof against a preset standard. In 
investigating the extent to which the curriculum makes use 
of the system's ability to recognize any valid proof, all 
variations in student proofs are relevant, including the 
existence of unused steps and differences in the order of 
steps. The first partition is sensitive to these 
variations and under it there were a large number of 
classes for most problems. The current LIS curriculum 
certainly encourages a large amount of variation at this 
level; it continues to encourage a reasonable amount of 
variation even as the criteria for equivalence are relaxed 
from the second to the fifth partitions. 

8.3 REMARKS ON THE CLASSIFICATION PROCEDURE 

The ambiguity in the notion of "different proofs" had 
to be resolved to conduct this study. The differences 
relevant to the evaluation of the curriculum are defined by 
the differences allowed by LIS, but there is no unique 
definition of "different" for a general investigation of 
variation in proof behavior. 

To some extent, any instrument used to measure 
variability in proof behavior (here, the classification 
criteria) will determine in advance the character and 
extent of variation found in a given set of data. A 
formalized classification has been employed in this study 
to insure consistency, but automation of the decision 
criteria does not eliminate any bias resulting 
from selective sensitivity to certain differences between 
proofs and insensitivity to all other differences. In 
fact, the results of this study show that both the amount 
of variation found for the curriculum as a whole and the 
relationship between variation and problem characteristics 
are quite sensitive to the criteria chosen; there are 
marked changes in the results of the regression analyses 
from the first partition to the fifth. 

The use of a nested sequence of partitions rather than 
a single partition limits the possibility that the 
variation observed was the result of an unpropitious choice 
of criteria. The first partition requires that proofs be 
identical, except for errors. The only requirement for 
equivalence under the fifth partition is that proofs use 
the same set of rules. These five partitions use a wide 
range of criteria; it is very unlikely that the results are 
due to a peculiarity of the classification procedures. 
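
The nesting of the partitions can be checked directly: 
every class of a stricter partition should lie inside a 
single class of the next, more lenient one. The sketch below 
is illustrative only; the function name refines is 
hypothetical, and the class memberships are the labels given 
for problem 411051 in Tables 7.2A, 7.2B, and 7.2C. 

```python
def refines(fine, coarse):
    """True if every class of `fine` lies inside one class of `coarse`."""
    return all(any(set(f) <= set(c) for c in coarse) for f in fine)


# Class memberships for problem 411051, by proof label.
first = [["A"], ["B"], ["C"], ["D"], ["E"], ["F"],
         ["G"], ["H"], ["I"], ["J"], ["K"]]
second = [["A"], ["B", "D", "H", "K"], ["C"], ["E", "F"],
          ["G"], ["I"], ["J"]]
third = [["A", "B", "D", "H", "K", "E", "F"], ["C", "J"], ["G", "I"]]

nested = refines(first, second) and refines(second, third)
class_counts = [len(first), len(second), len(third)]
```

For this problem the check succeeds and the class counts 
fall from eleven to seven to three, as reported in Chapter 
Seven. The reverse relation does not hold, since a lenient 
class such as C,J is split by the stricter partition. 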

However, it is possible that equivalence criteria 
defined along some other dimension would show a different 
pattern of results; for example, the latencies to various 
steps or the occurrence of certain types of errors might be 
used to study additional aspects of proof behavior. Since 
these dimensions of variability are not relevant to the 
present evaluation of the LIS curriculum, they are not 
considered here. 

The criteria defined for this study depend on the form 
of the completed student proofs. Examination of the data 
indicates that the assignments of individual proofs to 
equivalence classes are reasonable. The pattern of change 
in the results from one partition to the next is clear, and 
it is unlikely that any small change in the definition of 
equivalence would significantly modify this pattern. 

The definitions of equivalence developed here have 
turned out to be highly satisfactory for two reasons. 
First, examination of the partitions (over sets of student 
proofs) generated for a sample of problems indicates that 
the formal definitions of equivalence match intuitive 
notions of equivalence quite well. Second, the analysis 
that used the formal definitions confirmed the general 
expectations about the curriculum, but led to a much deeper 
and more detailed understanding of the nature of the 
variability found in student proofs; in addition, 
several unexpected properties of the relationship between 
curriculum structure and variability in student proofs were 
discovered using these techniques. Although the 
equivalence criteria used in this study are defined 
explicitly for the Stanford Logic-Instructional System, the 
general technique would be applicable to most formal 
problem-solving tasks. The development of this new 
technique for analyzing student behavior is probably the 
most important contribution of this research. 

8.4 VARIATION IN THE SAMPLE OF PROOFS 

The results discussed in Chapters V and VI indicate 
that variability in proof behavior can be predicted quite 
well from the known characteristics of a derivation 
problem. The first four variables to enter the equations 
generally account for about seventy-five percent of the 
variance in the dependent variable. These results must be 
interpreted with caution, since the study described here is 
exploratory and non-experimental. There is no control 
group and neither the subjects nor the problems were 
selected at random from a specified population. Thus, 
statistical inference to a larger population is not 
appropriate. Strictly speaking, the results apply to the 
population of students included in the analysis. 

However, the results may tentatively be extrapolated 
to other student populations and other curricula. The 
criteria for reasonable extrapolation should be the extent 
to which the tasks and the population in this study are 
representative of the target tasks and population. 
Decisions about the reasonableness of such extrapolations 
will depend on the characteristics of the particular target 
population and curriculum. 

There seem to be two distinct types of variation in 
the sample of proofs. The first type of variation 
involves differences in the order in which rules are used. 
The number of steps in the standard proof for a problem and 
the extent to which these steps are interdependent are good 
predictors of the extent of this kind of variation for a 
given problem. 

The second type of variation involves the rules used 
to prove a formula. The magnitude of this type of 
variation for a particular problem is best predicted by the 
number of theorems in its standard proof. The number of 
axioms in the standard proof and the number of rules 
available when the proof is reached in the curriculum are 
also good predictors for this second kind of variation. 

The importance of both the number of theorems and 
axioms used in the standard proof and the number of rules 
available increases systematically from the first set of 
equivalence criteria, which is the most stringent, to the 
fifth set of criteria, which is least stringent; in this 
progression the partitions become more and more sensitive 
to the second type of variation, involving the rules used 
to prove a formula. 

8.5 CONCLUDING REMARKS 

The most generally useful aspect of this study is 
probably the development of the classification procedures. 
The use of a nested sequence of measures provides a much 
more complete description of the variability found in the 
data than any single measure could provide. The 
classification criteria described in this study are 
specific to LIS, but the general properties of the 
technique depend only on the existence of behavior (proofs 
in this case) that can be segmented into discrete 
components (steps) chosen from some finite set. Hence, 
similar procedures could be developed for tasks requiring 
such behavior. 

APPENDIX A 



The material presented here describes the attempts 
that were made to identify patterns of proof behavior that 
characterize groups of students within the total sample. 
Two general types of analysis were used to answer two 
questions. First, do students exhibit definite patterns of 
behavior in the construction of proofs; and second, if they 
do, what are the defining characteristics of these 
patterns? The first analysis was based on the 
classification criteria, and required the development of a 
metric function over the set of students; the second 
analysis was based on new variables. 

The identification of a clustering of students into 
sub-groups would be of general interest in the study of 
human problem solving, and would also have important 
practical implications. Attempts are now being made to 
tailor instruction on LIS to fit the needs of individual 
students. The task of individualizing instruction might be 
greatly simplified if sequences of instruction were 
tailored for groups of students rather than for each 
individual. 

SECTION 1 - INTRODUCTION 

In the first analysis to be discussed, a distance 
matrix was defined for each of the five partitions. For 
each partition and each pair of students (Si and Sj), the 
distance was defined as: 

     M(i,j) = D(i,j) / N 

where D(i,j) is the number of problems for which Si and Sj 
constructed proofs that were not equivalent, and N is the 
number of problems for which both Si and Sj constructed 
proofs. Hierarchical clustering (HICLUS) was then used to 
group students on the basis of this metric, and the proofs 
for each student in each cluster were examined to determine 
the characteristics of individual proof behavior that 
explain the clusterings. Using these techniques, no clear 
indication of the existence of proof styles was detected. 
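
The distance defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (a dictionary of equivalence-class labels per student and problem, with None marking a missing proof) and the name distance_matrix are hypothetical and do not come from the study. Pairs of students with no problems in common are left undefined (None), which is one arbitrary choice among several.

```python
from itertools import combinations

def distance_matrix(classes):
    """classes[s][p] is the equivalence-class label of student s's proof
    for problem p, or None if s gave no proof (hypothetical encoding)."""
    students = sorted(classes)
    m = {s: {t: 0.0 for t in students} for s in students}
    for si, sj in combinations(students, 2):
        # N: problems for which both students constructed proofs.
        shared = [p for p in classes[si]
                  if classes[si][p] is not None
                  and classes[sj].get(p) is not None]
        if not shared:
            m[si][sj] = m[sj][si] = None  # undefined: no common problems
            continue
        # D(i,j): shared problems with non-equivalent proofs.
        d = sum(1 for p in shared if classes[si][p] != classes[sj][p])
        m[si][sj] = m[sj][si] = d / len(shared)
    return m

example = {
    "s1": {"p1": 0, "p2": 1, "p3": 0},
    "s2": {"p1": 0, "p2": 2, "p3": None},
    "s3": {"p1": 1, "p2": 2, "p3": 1},
}
M = distance_matrix(example)
print(M["s1"]["s2"], M["s1"]["s3"], M["s2"]["s3"])
```

In this toy example s1 and s2 share two problems and disagree on one of them, so their distance is one half; s1 and s3 disagree on every shared problem, so their distance is 1.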

For the second analysis, pattern variables (such as 
the frequency of theorem use) and efficiency variables 
(such as the number of lines per proof) were defined and 
computed for each student by averaging over the problems. 
For both sets of variables, attempts were made to cluster 
students on each variable and then on all of these 
variables together. These analyses indicated that strong 
individual differences do exist between students but there 
was no clear pattern in the differences observed. 

These results were not unexpected. The problems in 
the logic curriculum are too heterogeneous for this type of 
analysis, and differences in proofs from problem to problem 
were much more pronounced than the differences between 
students for a given problem. The methods developed for 
this part of the study, however, make possible a more 
systematic analysis of problem solving behavior and should 
be useful in future studies dealing with problem solving 
behavior. The results indicate that a more homogeneous set 
of problems must be used if interpretable patterns of 
behavior are to be identified. 

For the benefit of those who might wish to undertake a 
similar analysis, a description of the techniques that were 
used is included here. 

SECTION 2 - METRIC ANALYSIS 

A natural extension of the procedures which partition 
the set of proofs for derivation problems allowed a 
systematic examination of the data for indications that 
students could be characterized by the patterns of their 
proof behavior. The criteria (partitions) developed in 
Chapter III specify whether or not the proofs produced by 
any two students, for a particular problem, are equivalent. 
These techniques have been developed further in an attempt 
to determine whether the methods employed by any two 
students in constructing proofs to a sequence of problems 
are, in part, the same. 

It was possible, of course, to examine the student 
proofs looking for evidence that indicates the existence of 
such patterns and this was, in fact, done. Unfortunately, 
the fact that a large number of rules were available to the 
students provided the opportunity for many minor variations 
and tended to obscure any general patterns in the proofs 
constructed by the students. It was hoped that an 
automatic procedure that focused attention on the possible 
existence of such patterns would facilitate the search. 
The procedure which was used for this purpose is described 
below. 

Assume that we have a set of n problems, 
P = {p(1), ..., p(n)}, and a set of t students, 
S = {s(1), ..., s(t)}; for every p(i) in P and every s(j) in 
S, s(j) constructs a proof for p(i). We also assume that 
there exists a partition on the set of t proofs for each 
of the n problems. The metric matrix defined below was 
computed separately for each of the five sets of 
classification criteria. 

Let s(i) and s(j) be any two students in S. Let 
D(i,j) be the number of problems in P for which the proofs 
of s(i) and s(j) are not equivalent, and let 
M(i,j) = D(i,j)/n. It is clear that for all s(i),s(j) in 
S, M(i,j) is greater than or equal to 0, M(i,j) = M(j,i), and M(i,i) = 0. 

If the proofs for s(i) and s(j) are not equivalent in 
D(i,j) cases, and the proofs of s(j) and s(k) are not 
equivalent in D(j,k) cases, then the maximum number of 
problems where the proofs of s(i) and s(k) are not 
equivalent is D(i,j) + D(j,k): 

     D(i,k) <= D(i,j) + D(j,k) 

or 

     M(i,k) <= M(i,j) + M(j,k) 



The five matrices defined here (one for each set of 
classification criteria) are metrics defined on the set of 
students. M(i,j) is a measure of the distance between the 
student, s(i), and the student, s(j). It has its minimum 
value when s(i) and s(j) fall into the same equivalence 
class for all problems; then M(i,j) = 0. It has its 
maximum value when s(i) and s(j) are in different 
classes for all n problems, and in that case 
M(i,j) = n/n = 1. 
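
The properties claimed for M(i,j) can be checked numerically on any given matrix. The sketch below is a minimal illustration, not part of the original analysis; the function names and the list-of-lists encoding of class labels are assumptions. Note that two distinct students whose proofs are equivalent on every problem have distance 0, so in strict terms M is a pseudometric on S.

```python
from itertools import permutations

def proportion_nonequivalent(labels):
    """labels[s][p]: class label of student s's proof on problem p
    (hypothetical encoding, no missing proofs). Returns M(i,j) = D(i,j)/n."""
    t, n = len(labels), len(labels[0])
    return [[sum(labels[i][p] != labels[j][p] for p in range(n)) / n
             for j in range(t)] for i in range(t)]

def is_metric(m, tol=1e-12):
    """Check zero diagonal, non-negativity, symmetry, triangle inequality."""
    n = len(m)
    for i in range(n):
        if abs(m[i][i]) > tol:
            return False
        for j in range(n):
            if m[i][j] < -tol or abs(m[i][j] - m[j][i]) > tol:
                return False
    return all(m[i][k] <= m[i][j] + m[j][k] + tol
               for i, j, k in permutations(range(n), 3))

labels = [[0, 0, 1, 2], [0, 1, 1, 2], [1, 1, 0, 2]]
M = proportion_nonequivalent(labels)
print(M[0][1], M[1][2], M[0][2], is_metric(M))
```

Here M(0,2) = 0.75 equals M(0,1) + M(1,2) = 0.25 + 0.50, so the triangle inequality holds with equality for this triple.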

A measure of distance which takes into account the 
number of different proofs for each problem is: 

     M(i,j) = SUM over p of w(p)*e(p,i,j) 

where e(p,i,j) is equal to 0 if Si and Sj gave the same 
proof for problem p, and otherwise is equal to 1, and 
w(p) is the number of different proofs constructed for 
problem p. To correct for missing data, w(p) is set 
equal to 0 if either Si's or Sj's proof for problem p is 
missing. This improved definition of the distance matrix 
was suggested by Stanley Sclove. 
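
A minimal sketch of the weighted distance follows, assuming that the terms w(p)*e(p,i,j) are simply summed over the problems with no further normalization. The dictionary encoding, the use of None for a missing proof, and the name weighted_distance are all hypothetical.

```python
def weighted_distance(proofs_i, proofs_j, all_proofs):
    """proofs_i, proofs_j: dict problem -> proof-class label (None if missing).
    all_proofs: dict problem -> list of class labels from every student."""
    total = 0
    for p, labels in all_proofs.items():
        a, b = proofs_i.get(p), proofs_j.get(p)
        if a is None or b is None:
            w = 0  # missing data: the problem contributes nothing
        else:
            # w(p): number of different proofs constructed for problem p
            w = len({x for x in labels if x is not None})
        e = 0 if a == b else 1  # e(p,i,j)
        total += w * e
    return total

all_proofs = {"p1": [0, 0, 1], "p2": [0, 1, 2], "p3": [0, None, 1]}
si = {"p1": 0, "p2": 0, "p3": 0}
sj = {"p1": 0, "p2": 1, "p3": None}
sk = {"p1": 1, "p2": 2, "p3": 1}
print(weighted_distance(si, sj, all_proofs))
print(weighted_distance(si, sk, all_proofs))
```

A disagreement on a problem with many different proofs therefore counts for more than a disagreement on a problem where nearly all students gave the same proof.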

The HICLUS program, developed by S. C. Johnson 
(Johnson, 1967), was used to analyze the metric matrices for 
the full set of problems and for the subsample of problems 
that appear after the introduction of RE and that do not 
contain premises. The input data for this program consist 
of a metric matrix, M(i,j). The output is a sequence of 
stages or levels of clustering. At the first level, each 
student constitutes a distinct cluster. At each subsequent 
stage the two clusters with the shortest distance between 
them are combined into a single cluster until all of the 
students are in a single cluster. 

After each stage of clustering, it is necessary to 
redefine the distance matrix unambiguously, since the 
number of clusters decreases by one at each stage. The 
properties of the clustering algorithm are determined by 
the way in which this new matrix is formed. 

For the analysis described here, Johnson's "Maximum 
Method" was used to form the new matrix at each stage. 
This method ensures that the largest of the distances 
(defined in terms of the original metric matrix) between 
any two points in any cluster is a minimum. If we restrict 
ourselves to three dimensions and think of each cluster of 
points as being enclosed in a sphere with the smallest 
possible radius, we have n-k spheres after the k-th stage 
of clustering. The diameter of the largest of these 
spheres is less than the diameter of the largest sphere for 
any other set of n-k spheres that enclose all the points of 
the sample. A more detailed discussion of HICLUS is found 
in Appendix B. 

This method generates n stages of clustering for any 
distance matrix and it was necessary to decide which, if 
any, of these clusterings should be the basis for 
subsequent analysis. There are two conflicting criteria 
that must be resolved in choosing the appropriate 
clustering. First, the intracluster distances should be 
small compared to the intercluster distances; the clusters 
are then geometrically well-defined. Second, the number of 
clusters should be small compared to the number of 
students; if the number of clusters is not much smaller 
than the number of points, clustering does not contribute 
to the analysis. 

HICLUS provides information on both of these criteria 
at each stage of clustering; it gives us the membership of 
each cluster and the diameter of the largest cluster. The 
value, a(k), is the diameter of the largest sphere at the 
k-th stage of clustering, and is a monotone increasing 
function of k. A sharp increase in a(k) between the i-th 
stage and the (i+1)-th stage indicates that the i-th stage 
of clustering is a promising candidate for further analysis 
since we must accept much less compact clusters in order to 
decrease the number of clusters beyond the i-th stage. 
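
The search for a sharp increase in a(k) can be illustrated with a short sketch. The rule used here (take the stage that precedes the single largest increase) is an assumption made for illustration; the source describes the criterion only informally, and the function name and sample values are hypothetical.

```python
def sharpest_jump(a):
    """a[k] is the diameter of the largest cluster after stage k
    (a monotone non-decreasing sequence). Returns the stage i that
    precedes the largest increase a[i+1] - a[i]; ties go to the
    earliest such stage."""
    if len(a) < 2:
        raise ValueError("need at least two stages")
    jumps = [a[k + 1] - a[k] for k in range(len(a) - 1)]
    return max(range(len(jumps)), key=jumps.__getitem__)

# Hypothetical diameters for six successive stages of clustering.
a = [0.0, 0.10, 0.12, 0.15, 0.60, 0.65]
print(sharpest_jump(a))
```

A candidate found this way would still have to satisfy the second criterion, that the number of clusters be small compared to the number of students.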

The results of this analysis were not encouraging. 
Since HICLUS would have generated clusters even if the 
distance matrix had been randomly generated, the clusters 
that it did generate for the data in this study could not 
be accepted without further justification. None of the 
clusterings generated met the two criteria mentioned above, 
and none of these clusterings were readily interpretable in 
terms of the actual proofs in the data. 

To facilitate the interpretation of the output of 
HICLUS, a complementary technique, multidimensional 
scaling, was also used. The objective of multidimensional 
scaling is to find a distribution of n points in 
k-dimensional Euclidean space that gives the best 
approximation to the n by n distance matrix. MDSCAL (a 
multidimensional scaling program) accepts as input an 
n by n distance matrix and a specification of the number 
of dimensions to be used. 

Therefore, MDSCAL can be used to generate a two 
dimensional representation (k=2) of a distribution of 
points that yields the best approximation to our distance 
matrix. This approximation, however, may be a poor one 
because, in general, it requires an n-1 dimensional 
distribution of points to reproduce exactly an n by n 
distance matrix. If the distance matrix can be reproduced 
from a distribution of 26 points on a two dimensional 
hyperplane, then a graphic representation of the clusters 
can be prepared from the results and the data can be 
examined visually for evidence of clustering. While 
determining the hyperplane that gives the best fit, MDSCAL 
also calculates how good the approximation is, and this 
measure, the stress, can be used to decide whether the two 
dimensional approximation is good enough to be taken 
seriously. 
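
The role of the stress measure can be illustrated with a simplified calculation. The sketch below evaluates Kruskal's stress formula 1 for a fixed configuration of points, under the simplifying assumption that the target distances themselves are used in place of the monotone-regression disparities that a nonmetric scaling program would compute. It does not search for the best configuration, and the function name and example data are hypothetical.

```python
import math

def stress(points, target):
    """Simplified stress-1: sqrt(sum (d - t)^2 / sum d^2) over pairs i < j,
    where d is the distance between points i and j in the configuration
    and t is the corresponding target distance."""
    num = den = 0.0
    n = len(points)
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            num += (d - target[i][j]) ** 2
            den += d ** 2
    if den == 0.0:
        raise ValueError("all points coincide")
    return math.sqrt(num / den)

# Three points whose distances are reproduced exactly in the plane.
triangle = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
exact = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]

# Four mutually equidistant objects cannot be placed exactly in the plane;
# a unit square is one imperfect two dimensional arrangement.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
ones = [[0 if i == j else 1 for j in range(4)] for i in range(4)]

print(stress(triangle, exact), round(stress(square, ones), 4))
```

The second example shows why a low dimensional picture may be a poor approximation: four equidistant objects need three dimensions, so any planar arrangement has non-zero stress.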

The two dimensional representation of the data 
obtained in this way did not indicate the existence of any 
clusters. If geometrically well-defined clusters had 
existed, then I would have attempted to determine the 
characteristics of individual behavior that accounted for 
the existence of these clusters. This would have been done 
by examining the proofs for the students in each group for 
similarities of structure. The important consideration 
here was not just the existence of clusters, but the 
interpretability of the clusters in terms of student 
behavior. 

In this part of the analysis, an attempt was made to 
cluster students without using any predetermined 
characteristics of their proofs. Instead, the metric 
analysis was based on a distance matrix where the distance 
between any two students is defined in terms of the number 
of problems for which they generated non-equivalent proofs. It 
was anticipated that the interpretation of any clustering 
found in this way would be difficult because the clustering 
was not explicitly grounded in the characteristics of 
student proofs. In order to facilitate the identification 
of the defining characteristics of the clusters, a second 
analysis was used that clustered students in terms of 
explicitly defined pattern variables. The results of this 
analysis were to serve as a guide to the metric analysis 
and as a check on that analysis. 

SECTION 3 - PATTERN ANALYSIS 

The second analysis of the pattern of student 
performance concentrated on specific aspects of the proofs, 
defined by the pattern variables. For each of these 
variables averages were taken over the two sets of problems 
described earlier. The pattern variables are listed 
below: 

P1 - the number of theorem steps per proof 

P2 - the ratio of the number of theorem 
steps to the total number of steps 

P3 - the number of axiom steps per proof 

P4 - the ratio of the number of axiom 
steps to the total number of steps 

P5 - the number of Logical Truth steps 
per proof 

P6 - the ratio of the number of Logical 
Truth steps to the total number of 
steps 

P7 - the ratio of the latency to the first 
step to the average latency of all 
steps in the proof 

The data for these variables were first examined 
individually for indications of clustering. Their 
correlation matrix was computed and frequency histograms 
were prepared for each. This initial examination of the 
data did not indicate the existence of any distinct groups 
of students, where the differences between the students in 
a group were small compared to the differences between 
groups. 
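
As an illustration of how the first six pattern variables might be computed for one student, the sketch below assumes that each proof is recorded as a list of step kinds and that the ratios are formed proof by proof before averaging. Both assumptions, and the name pattern_variables, are hypothetical; the source states only that averages were taken over problems. P7 is omitted because it requires latency data, and every proof is assumed to contain at least one step.

```python
def pattern_variables(proofs):
    """proofs: the proofs of one student; each proof is a list of step
    kinds such as "theorem", "axiom", "LT", or "other" (hypothetical
    encoding). Returns P1..P6 averaged over the proofs."""
    kinds = ("theorem", "axiom", "LT")
    sums = [0.0] * 6
    for steps in proofs:
        for n, kind in enumerate(kinds):
            count = sum(1 for s in steps if s == kind)
            sums[2 * n] += count                   # P1, P3, P5: steps per proof
            sums[2 * n + 1] += count / len(steps)  # P2, P4, P6: share of all steps
    return {f"P{k + 1}": v / len(proofs) for k, v in enumerate(sums)}

student = [["axiom", "theorem", "other", "other"], ["LT", "LT"]]
pv = pattern_variables(student)
print(pv)
```

For this toy student, the second proof consists entirely of Logical Truth steps, which pulls P5 and P6 up while leaving the theorem and axiom variables low.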

The analysis was then extended to the multivariate 
case by using principal components analysis. The values, 
for each student, of the first two principal components 
were used to plot the distribution of students in two 
dimensions. Again, there was no indication of clustering. 

The same analyses were also applied to a second set of 
variables called efficiency variables. These variables 
were also averages over problems for each student. The 
efficiency variables are listed below: 

E1 - the number of unused lines per proof 

E2 - the ratio of the number of unused lines to 
the total number of lines 

E3 - the number of lines per proof 

E4 - the total latency (time) per proof 

Using the efficiency variables, there was again no reliable 
basis for clustering the students. 

SECTION 4 - DISCUSSION 

Although all of the attempts to cluster students 
failed, the analysis discussed here did highlight one 
interesting artifact in the data. There were three 
students who had unusually poor performances as measured by 
all of the efficiency variables. Examination of the proofs 
constructed by these students revealed a consistently poor 
performance starting very early in the curriculum. 

These same students also tend to have extreme values 
for the pattern variables. On P5 (LT/problem) and 
P6 (LT/step), these three students have very high values. 
On P2 (theorems/step), they have very low values. 

The students who did most poorly in the curriculum 
show a marked tendency to use Logical Truth even when an 
appropriate theorem is available. Logical Truth is a 
conceptually simple rule that is introduced early in the 
curriculum. As the more powerful rules, especially the 
theorems, become available, most of the students learn to 
use them where they are appropriate. The three students 
being considered here did not make this transition. 

Since their performance was poor relative to the 
average of the other students even before the introduction 
of any theorems, it cannot be concluded that the failure to 
incorporate theorems into their working set of rules caused 
the poor performance. This failure, however, did widen the 
gap between the poorest students and the average and 
superior students. 

It would seem then that the pace of LIS is too fast 
for some of the students who are using the system. A more 
thorough investigation of the characteristics of these 
students should be conducted in order to determine the 
causes of their failure. 

APPENDIX B 



HICLUS 



The HICLUS program, developed by S. C. Johnson 
(Johnson, 1967), was used to analyze the metric matrices for 
the third stage of the analysis. The input data for this 
program consist of a metric matrix, M(i,j). The output is 
a sequence of stages or levels of clustering. At the first 
level, each student constitutes a distinct cluster. At 
each subsequent stage the two clusters with the smallest 
distance between them are combined into a single cluster 
until all of the students are in a single cluster. 

HICLUS begins its analysis with the weak clustering, 
C(0), in which each student defines a separate cluster. If 
a(1) is the smallest non-zero entry in the distance matrix, 
then the two clusters that are separated by the distance, 
a(1), in C(0) are combined to form a single cluster in 
C(1). The value of C(1) is defined to be a(1). 

If the distance between any two clusters in C(1) is 
defined unambiguously, a new (n-1)x(n-1) distance matrix is 
defined for the n-1 clusters in C(1). The clustering 
process can then be continued by combining the closest 
clusters in C(1) to form C(2), with value, a(2). After n 
steps, all of the students have been combined into a single 
cluster, C(n), with value a(n). 

The problem is to define the new (n-k)x(n-k) distance 
matrix that results after the k-th stage in clustering. If 
X and Y are the two clusters in C(k-1) that are combined into 
a single cluster, [X,Y], in C(k), what is the distance 
between [X,Y] and any other cluster, Z, in C(k)? Johnson 
offers two possible answers to this question. 

For the "minimum method", the distance, in C(k), from 
[X,Y] to Z is defined to be the minimum of the distances 
from X to Z and from Y to Z in C(k-1): 

     d([X,Y],Z) = min[d(X,Z),d(Y,Z)]. 

For the "maximum method", the distance is defined as: 

     d([X,Y],Z) = max[d(X,Z),d(Y,Z)]. 

Each of these definitions has a clear geometric 
interpretation. Johnson (op. cit., p. 249) outlines this 
interpretation in the following way: 



If we are given a clustering obtained by the 
Maximum Method, we may present the value of the 
clustering as follows: for each cluster in the 
clustering, compute the diameter of the cluster 
(the largest intra-cluster distance). For a 
given Maximum Method clustering, the value of the 
clustering is the maximum diameter of the 
clusters in the clustering. At any stage, the 
distance from the object/cluster x to the 
object/cluster y is exactly the diameter of the 
set x union y. This gives us a simple means of 
visualizing the clusterings: the Maximum Method 
attempts at each stage to minimize the diameter 
of the clusters. 



The geometric properties of the Minimum Method are 
slightly more complicated, and are discussed in some detail 
by Johnson. Since I did not use this method, and since 
Johnson discusses it in detail, I will not describe it here. 
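
The clustering procedure described in this appendix can be sketched as follows. This is an illustrative reimplementation written for this discussion, not the HICLUS program itself; the function name hiclus, the list-of-lists input, and the tie-breaking rule (the first closest pair found is merged) are assumptions. Each returned stage gives the value of the clustering and the cluster membership after the merge.

```python
from itertools import combinations

def hiclus(dist, method="max"):
    """dist: symmetric distance matrix as a list of lists.
    method: "max" (Maximum Method) or "min" (Minimum Method).
    Returns a list of (value, clusters) pairs, one per merge."""
    combine = {"max": max, "min": min}[method]
    clusters = [frozenset([i]) for i in range(len(dist))]
    # Distances between the initial one-object clusters.
    d = {frozenset([a, b]): dist[next(iter(a))][next(iter(b))]
         for a, b in combinations(clusters, 2)}
    stages = []
    while len(clusters) > 1:
        # Combine the two clusters separated by the smallest distance.
        x, y = min(combinations(clusters, 2),
                   key=lambda ab: d[frozenset(ab)])
        value = d[frozenset([x, y])]
        merged = x | y
        rest = [z for z in clusters if z not in (x, y)]
        for z in rest:
            # d([X,Y],Z) = max or min of d(X,Z) and d(Y,Z).
            d[frozenset([merged, z])] = combine(d[frozenset([x, z])],
                                                 d[frozenset([y, z])])
        clusters = rest + [merged]
        stages.append((value, sorted(sorted(c) for c in clusters)))
    return stages

dist = [[0, 1, 4, 5],
        [1, 0, 2, 6],
        [4, 2, 0, 3],
        [5, 6, 3, 0]]
for value, members in hiclus(dist, "max"):
    print(value, members)
```

With the Maximum Method the values for this example are 1, 3, and 6, each equal to the diameter of the largest cluster at that stage; with the Minimum Method the same data chain together at values 1, 2, and 3.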

HICLUS has two additional advantages that should be 
mentioned. First, the input consists of the n(n-1)/2 
distances between the n objects; the algorithm does not 
require that the n points be represented in Euclidean space, 
and will accept the metric matrices defined in Chapter IV 
without further processing. Second, the results are 
invariant under monotone transformations of the metric data. 